Abstract. In this paper we develop procedures to make inference in regression models
about how potential policy interventions affect the entire distribution of an outcome
variable of interest. These policy interventions consist of counterfactual changes in the
distribution of covariates related to the outcome. Under the assumption that the
conditional distribution of the outcome is unaltered by the intervention, we obtain uniformly
consistent estimates for functionals of the marginal distribution of the outcome before
and after the policy intervention. Simultaneous confidence sets for these functionals are
also constructed, which take into account the sampling variation in the estimation of
the relationship between the outcome and covariates. This estimation can be based on
several principal approaches for conditional quantile and distribution functions,
including quantile regression and proportional hazard models. Our procedures are general and
accommodate both simple unitary changes in the values of a given covariate and
changes in the distribution of the covariates of general form. An empirical application
and a Monte Carlo example illustrate the results.
1.  Introduction

A common problem in economics is to predict the effect of a potential policy intervention
or a counterfactual change in economic conditions on some outcome variable of
interest. For example, economists and policy analysts might be interested in what the
wage distribution would have been in 2000 had workers had the same characteristics as
in 1990, what the distribution of infant birth weights for black mothers would have been
had they received the same amount of prenatal care as white mothers, the effect on the
distribution of food expenditure resulting from a change in income taxes, or the effect on
the distribution of housing prices resulting from cleaning up a local hazardous-waste site.
More generally, we can think of a policy intervention as a change in the distribution of
a set of explanatory variables $X$ that determine the response variable of interest $Y$. The
policy analysis then consists in estimating the effect on the distribution of $Y$ of a change
in the distribution of $X$.

In this paper we develop procedures to make inference in regression models about how
these counterfactual policy interventions affect the entire marginal distribution of $Y$. The
main assumption is that the policy does not affect the relationship between the covariates
and outcome. In other words, the conditional distribution of $Y$ given $X$ is not altered by
the policy intervention. Starting from an estimate of the conditional model for the
relationship between the outcome and covariates, we obtain uniformly consistent estimates
for functionals of the distribution functions of the outcome before and after the
intervention. Examples of these functionals include the distribution functions themselves, quantile
functions, quantile treatment effects, distribution effects, means, variances, and Lorenz
curves. Confidence sets are then constructed around the estimates that take into account
the sampling variation coming from the estimation of the conditional model. These
confidence sets are uniform in that they cover the entire functional of interest with pre-specified
probability. The analysis is based upon several principal approaches to estimating
conditional quantile functions and conditional distribution functions, including, for example,
quantile regressions and proportional hazard models.

The proposed inference procedures can be used to analyze the effect of both simple
interventions consisting of unitary changes in the values of a given covariate and more
elaborate policies consisting of general changes in the covariate distribution. Moreover,
the counterfactual distribution for the covariates can correspond to a known transformation
of the values of the covariates in the population or to the covariate distribution in
a different subpopulation or group. This variety of alternatives allows us to analyze, for
instance, the effect of a redistribution of the covariates within the population or what
the counterfactual distribution of the outcome in one subpopulation would have been had
the covariates been distributed as in a different subpopulation.

To develop the statistical inference results, we establish the compact or Hadamard
differentiability of the marginal distribution functions before and after the policy with
respect to the limit of the estimators of the conditional model of the outcome given the
covariates, tangentially to the set of continuous functions. By means of the functional
delta method, this result allows us to derive the asymptotic distribution for the functionals
of interest, taking into account the sampling variation coming from the first stage
estimation of the relationship between the outcome and covariates. Moreover, this general
approach based on functional differentiability also makes it easier to establish the validity
of convenient resampling methods to make uniform inference on the functionals of interest.

Because our analysis relies only on the conditional quantile estimators or conditional
distribution estimators satisfying a functional central limit theorem, it applies quite broadly
and covers such major methods as (1) conventional classical regression and its
generalizations, (2) quantile regression and its generalizations, (3) duration models, and (4)
distribution regression models. As a consequence a wide array of techniques is covered;
in the discussion we devote most attention to the most practical and commonly used
methods of estimating conditional quantities.

The results in the paper are related to the previous literature on policy estimators.
Stock (1989) introduces nonparametric estimators to evaluate the mean effect of policy
interventions. Gosling, Machin, and Meghir (2000) and Machado and Mata (2005)
propose policy estimators based on quantile regression models, but do not formally
develop limit distribution theory for these estimators. Imbens and Newey (2006) derive
identification results and nonparametric estimators for average policy effects in structural
nonseparable models. This paper establishes the Gaussian limit distribution for the
entire outcome distribution after a general policy intervention for a variety of estimators
based on regression models, including location-scale models, conditional quantile models,
proportional hazard models, and distribution regression models. To derive this result we
formally establish the Hadamard differentiability of the counterfactual outcome
distribution with respect to the limit of the conditional processes, which is required to apply the
functional delta method. A recent paper by Firpo, Fortin, and Lemieux (2007) studies
the effects of special policy interventions consisting of marginal changes in the values of
the covariates. Their approach, based on a linearization of the functionals of interest, is
clearly different from ours.

The rest of the paper is organized as follows. Section 2 describes methods to perform
counterfactual analysis, setting up the modelling assumptions for the counterfactual
outcomes and introducing the policy estimators. Section 3 derives limit distribution results
for the policy estimators to perform uniform inference on functionals of the distribution of
policy effects. Section 4 illustrates the estimation and inference procedures with numerical
examples, and Section 5 concludes with a summary of the main results.


2.  Methods  for  Counterfactual  Analysis 

2.1. The model: observed and counterfactual outcomes. In our analysis it is
important to distinguish between observed and counterfactual outcomes. Observed outcomes
come from the population before the policy intervention and are therefore observable,
whereas counterfactual outcomes come from the population after the policy intervention
and are therefore unobservable. We assume that the covariates are observable before and
after the policy intervention. The observed outcomes are used to establish the relationship
between the outcome and the covariates, which, together with the observed
counterfactual distribution of the covariates, determines the distribution of the outcome after the
intervention under some conditions that we make precise below.

For the purposes of specifying a model of how the counterfactual outcome is generated,
it is convenient to look at the relationship between the observed outcome and covariates
using a conditional quantile representation. Let $Y^o$ be the observed outcome, and $X^o$ be
the $p \times 1$ vector of covariates with distribution function $F_X^o$ before the policy intervention.
Let $Q_Y(u \mid x)$ denote the conditional $u$-quantile of $Y^o$ given $X^o = x$. The outcome $Y^o$ can be
linked to the conditional quantile function via the Skorohod representation:

$$ Y^o = Q_Y(U^o \mid X^o), \quad \text{where } U^o \sim U(0,1) \text{ independently of } X^o \sim F_X^o. \tag{2.1} $$

This representation emphasizes that the outcome is a function of the covariates and the
disturbance $U^o$. In the classical regression model, the disturbance is separable from the
covariates, as in the location shift model described below, but generally it need not be.
Our analysis will cover both cases.

The counterfactual experiment consists of drawing the vector of covariates from a
different distribution, i.e., $X^c \sim F_X^c$, where $F_X^c$ is a known distribution function for the
covariates after the policy intervention. Under the assumption that the conditional
quantile function is not altered by the policy, the counterfactual outcome $Y^c$ is generated
by

$$ Y^c = Q_Y(U^c \mid X^c), \quad \text{where } U^c \sim U(0,1) \text{ independently of } X^c \sim F_X^c. \tag{2.2} $$

Note that in the construction of the counterfactual outcome we make the additional
assumption that the quantile function $Q_Y(u \mid x)$ can be evaluated at each point $x$ in the
support of the distribution of covariates $F_X^c$. This assumption requires either that the
support of $F_X^c$ be a subset of the support of $F_X^o$ or that the quantile function can be
suitably extrapolated outside the support of $F_X^o$.
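As an illustration of how (2.1) and (2.2) generate observed and counterfactual outcomes, the following minimal Python sketch simulates both. The quantile function `q_y`, the uniform covariate distributions, and the sample size are hypothetical choices made only for this example; they are not taken from the paper. The counterfactual support, $[0.5, 1]$, is deliberately placed inside the observed support, $[0, 1]$, so no extrapolation of the quantile function is needed.

```python
import random

def q_y(u, x):
    """Hypothetical conditional quantile function Q_Y(u|x), increasing in u.

    Location-scale form with a uniform disturbance; chosen only for illustration.
    """
    return 1.0 + 2.0 * x + (0.5 + 0.5 * x) * (u - 0.5)

def draw_outcomes(x_sample, rng):
    """Skorohod representation: Y = Q_Y(U|X), with U ~ U(0,1) drawn independently of X."""
    return [q_y(rng.random(), x) for x in x_sample]

def mean(values):
    return sum(values) / len(values)

rng = random.Random(0)
n = 5000
x_obs = [rng.random() for _ in range(n)]              # X^o ~ F_X^o = U(0, 1)
x_cf = [0.5 + 0.5 * rng.random() for _ in range(n)]   # X^c ~ F_X^c = U(0.5, 1)

y_obs = draw_outcomes(x_obs, rng)   # observed outcomes, as in (2.1)
y_cf = draw_outcomes(x_cf, rng)     # counterfactual outcomes, as in (2.2)

print(round(mean(y_obs), 2), round(mean(y_cf), 2))
```

The same function `q_y` is used for both samples, which is the sense in which the conditional model is unaltered by the policy; only the covariate distribution changes.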

The assumptions for the model that generates the counterfactual outcome can be stated
formally as:

M.1 The conditional distribution of the outcome given the covariates is the same before
and after the policy intervention.

M.2 The conditional model holds for all $x \in \mathcal{X}$, where $\mathcal{X}$ is a compact subset of $\mathbb{R}^p$
that contains the supports of $F_X^o$ and $F_X^c$.

2.2. Types of Counterfactual Changes. We consider two different types of changes
in the distribution of the covariates:

(1) The covariates are drawn from a different subpopulation before and after the
intervention. These subpopulations might correspond to different demographic groups,
time periods or geographic locations. Examples include the distributions of worker
characteristics in different years, distributions of socioeconomic characteristics for
black versus white mothers, or more generally distributions of covariates in a
treatment group versus a control group.

(2) The policy intervention can be implemented as a known transformation of the
distribution of the observed covariates; that is, $X^c = g(X^o)$, where $g(\cdot)$ is a known
function. This case covers, for example, unitary changes in the location of one
of the covariates, $X^c = X^o + e_j$, where $e_j$ is a unitary $p \times 1$ vector with a one in
position $j$; or mean preserving redistributions of the covariates implemented
as $X^c = (1 - \alpha)E[X^o] + \alpha X^o$. These kinds of policies can be used to estimate the
effect on infant birth weights of an increase in the number of cigarettes smoked
by the mother during pregnancy, the effect on food expenditure resulting from a
change in income taxes, or the effect on housing prices resulting from cleaning up
a local hazardous-waste site (Stock, 1991).
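The two transformations named in case (2) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, covariate vectors are represented as plain lists, and the population mean $E[X^o]$ is replaced by the sample mean, which is an assumption of the sketch rather than something the text prescribes.

```python
def unit_shift(x, j):
    """X^c = X^o + e_j: add one unit to covariate j of a single observation x."""
    return [v + 1.0 if k == j else v for k, v in enumerate(x)]

def mean_preserving(sample, alpha):
    """X^c = (1 - alpha) * E[X^o] + alpha * X^o, with E[X^o] replaced by the sample mean."""
    n, p = len(sample), len(sample[0])
    means = [sum(row[k] for row in sample) / n for k in range(p)]
    return [[(1 - alpha) * means[k] + alpha * row[k] for k in range(p)]
            for row in sample]

x_obs = [[0.0, 10.0], [2.0, 30.0]]
print([unit_shift(x, 0) for x in x_obs])
print(mean_preserving(x_obs, 0.5))
```

With `alpha` between 0 and 1 the second transformation shrinks every covariate toward its mean, so the covariate means are unchanged while their dispersion falls.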

Note that these two cases correspond to conceptually different thought experiments. The
statistical analysis that follows, however, will cover either situation without modification.
The main difference will be that the second case corresponds to an almost perfectly
controlled experiment, which provides additional information to identify more features of
the joint distribution of the outcome before and after the intervention.

2.3. Functionals of interest. To make inference on the general effect on the outcome
of the policy intervention, we need to identify the distribution and quantile functions of
the outcome before and after the policy. The conditional distribution associated with the
quantile function $Q_Y(u \mid x)$ is given by:

$$ F_Y(y \mid x) = \int_0^1 1\{Q_Y(u \mid x) \le y\}\,du. \tag{2.3} $$

Given our assumptions about how the counterfactual outcome is generated, the marginal
distributions are given by

$$ F_Y^j(y) := \Pr\{Y^j \le y\} = \int_{\mathcal{X}} F_Y(y \mid x)\,dF_X^j(x), \tag{2.4} $$

with corresponding marginal $u$-quantile functions

$$ Q_Y^j(u) = \inf\{y : F_Y^j(y) \ge u\}, \tag{2.5} $$

where $j$ indexes the status before or after the policy, $j \in \{o, c\}$. The $u$-quantile treatment
effect of the policy is then given by

$$ QTE_Y(u) = Q_Y^c(u) - Q_Y^o(u). \tag{2.6} $$

Likewise, the $y$-distribution effect of the policy is given by

$$ DE_Y(y) = F_Y^c(y) - F_Y^o(y). \tag{2.7} $$
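The chain from the conditional distribution to the marginal distributions, quantiles, and the two effects in (2.4)-(2.7) can be traced numerically. The sketch below is only illustrative: the conditional distribution `cond_cdf` is a hypothetical uniform model, each covariate distribution $F_X^j$ is taken to be the empirical distribution of a small list of values, and the infimum in (2.5) is approximated by a search over a finite grid.

```python
def cond_cdf(y, x):
    """Hypothetical F_Y(y|x): Y given X = x is uniform on [x, x + 1]."""
    return min(1.0, max(0.0, y - x))

def marginal_cdf(y, x_sample):
    """(2.4), with F_X^j replaced by the empirical distribution of x_sample."""
    return sum(cond_cdf(y, x) for x in x_sample) / len(x_sample)

def marginal_quantile(u, x_sample, y_grid):
    """(2.5) on an increasing grid: smallest grid point y with F_Y^j(y) >= u."""
    for y in y_grid:
        if marginal_cdf(y, x_sample) >= u:
            return y
    return y_grid[-1]

def qte(u, x_obs, x_cf, y_grid):
    """(2.6): counterfactual minus observed marginal u-quantile."""
    return marginal_quantile(u, x_cf, y_grid) - marginal_quantile(u, x_obs, y_grid)

def de(y, x_obs, x_cf):
    """(2.7): counterfactual minus observed marginal distribution at y."""
    return marginal_cdf(y, x_cf) - marginal_cdf(y, x_obs)

x_obs = [0.0, 1.0]
x_cf = [x + 1.0 for x in x_obs]          # unit shift of the covariate
y_grid = [k / 100 for k in range(401)]   # grid on [0, 4]
print(qte(0.5, x_obs, x_cf, y_grid), de(1.5, x_obs, x_cf))
```

In this toy model a unit shift of the covariate moves the whole outcome distribution one unit to the right, so the quantile treatment effect is one and the distribution effect is negative.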

Other functionals of interest might be the Lorenz curves of the observed and
counterfactual outcomes. These curves, commonly used to measure inequality, are ratios of
partial means to overall means

$$ \int_{-\infty}^{y} t\,dF_Y^j(t) \Big/ \int_{-\infty}^{\infty} t\,dF_Y^j(t), $$

provided that the integrals exist and $\int_{-\infty}^{\infty} t\,dF_Y^j(t) \neq 0$, for $j \in \{o, c\}$. More generally, we
might be interested in functionals of the marginal distributions of the outcome before and
after the intervention

$$ H_Y(y) = \phi(F_Y^o, F_Y^c, y). \tag{2.8} $$

These functionals include distributions, quantiles, quantile treatment effects, distribution
effects, and Lorenz curves as special cases, but also other characteristics such as means
with $\phi(F_Y^o, F_Y^c, y) = \int_{-\infty}^{\infty} t\,dF_Y^j(t) =: \mu_Y^j$; mean effects with $\phi(F_Y^o, F_Y^c, y) = \mu_Y^c - \mu_Y^o$;
variances with $\phi(F_Y^o, F_Y^c, y) = \int_{-\infty}^{\infty} t^2\,dF_Y^j(t) - (\mu_Y^j)^2 =: (\sigma_Y^j)^2$; and variance effects with
$\phi(F_Y^o, F_Y^c, y) = (\sigma_Y^c)^2 - (\sigma_Y^o)^2$.
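To make these functionals concrete, the sketch below evaluates the Lorenz ordinate, the mean, and the variance when each marginal distribution $F_Y^j$ is an empirical distribution placing equal mass on a list of outcome values. That representation, the function names, and the two small samples are assumptions of the example only.

```python
def mean(sample):
    """Mean functional: integral of t dF(t) for an empirical distribution."""
    return sum(sample) / len(sample)

def variance(sample):
    """Variance functional: integral of t^2 dF(t) minus the squared mean."""
    return sum(t * t for t in sample) / len(sample) - mean(sample) ** 2

def lorenz(y, sample):
    """Lorenz ordinate at y: partial mean up to y divided by the overall mean."""
    total = sum(sample)
    if total == 0:
        raise ValueError("the overall mean must be nonzero")
    return sum(t for t in sample if t <= y) / total

y_obs = [1.0, 2.0, 3.0, 4.0]   # hypothetical draws from F_Y^o
y_cf = [2.0, 3.0, 4.0, 5.0]    # hypothetical draws from F_Y^c

mean_effect = mean(y_cf) - mean(y_obs)
variance_effect = variance(y_cf) - variance(y_obs)
print(lorenz(2.0, y_obs), mean_effect, variance_effect)
```

Each quantity is a map from the pair of marginal distributions to a number or a function of $y$, which is all that the general functional in (2.8) requires.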

In the case where the policy consists of a known transformation of the distribution of
the covariates, $X^c = g(X^o)$, we can also identify the distribution and quantile functions
for the effects of the policy, $\Delta = Y^c - Y^o$, by

$$ F_\Delta(\delta) = \int_{\mathcal{X}} \int_0^1 1\{Q_Y(u \mid g(x)) - Q_Y(u \mid x) \le \delta\}\,du\,dF_X^o(x) \tag{2.9} $$

and

$$ Q_\Delta(u) = \inf\{\delta : F_\Delta(\delta) \ge u\}, \tag{2.10} $$


under the additional assumption

RP Conditional rank preservation: $U^c = U^o \mid X^o$.

See, e.g., Heckman, Smith, and Clements (1997).

Footnote: In the rest of the discussion we keep the distribution, quantile, quantile treatment effects, and
distribution effects functions as separate cases to emphasize the importance of these functionals in practice.
Lorenz curves are special cases of the general functional with
$\phi(F_Y^o, F_Y^c, y) = \int_{-\infty}^{y} t\,dF_Y^j(t) \big/ \int_{-\infty}^{\infty} t\,dF_Y^j(t)$,
and will not be considered separately.
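Under rank preservation, the same disturbance is fed through the quantile function at $x$ and at $g(x)$, so the distribution of the effect in (2.9) can be approximated by averaging an indicator over a grid of $u$ values and over the covariate sample. The following sketch assumes a hypothetical quantile function `q_y`, a scalar covariate, and the empirical distribution of a short list in place of $F_X^o$.

```python
def q_y(u, x):
    """Hypothetical conditional quantile function Q_Y(u|x), increasing in u for x > -1."""
    return x + (1.0 + x) * u

def effect_cdf(delta, x_sample, g, n_grid=1000):
    """(2.9): average over x (empirical F_X^o) and over a midpoint grid for u."""
    hits = 0
    for x in x_sample:
        gx = g(x)
        for k in range(n_grid):
            u = (k + 0.5) / n_grid
            # Same rank u before and after the policy (rank preservation).
            if q_y(u, gx) - q_y(u, x) <= delta:
                hits += 1
    return hits / (n_grid * len(x_sample))

x_obs = [0.0, 1.0, 2.0]
shift = lambda x: x + 1.0   # unit shift of the covariate
print(effect_cdf(1.5, x_obs, shift))
```

For this `q_y`, the individual effect of a unit shift is $1 + u$ regardless of $x$, so the effect is uniformly distributed on $[1, 2]$ and the function returns one half at $\delta = 1.5$.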

2.4. Conditional models. The quantile representation for the relationship between
outcome and covariates is a useful modeling tool to understand how the counterfactual
outcomes are generated, but it is not necessary for identifying the marginal distribution and
quantile functions. These functions depend only on the conditional distribution of the
outcome, so we can proceed either by directly specifying a model for the conditional
distribution, or by specifying a model for the conditional quantiles and then obtaining the
conditional distribution using the expression (2.3). We next describe several principal
methods for modeling and estimating conditional quantile and distribution functions.

Example 1. Location regression and generalizations. The inference results of
this paper cover the classical regression model as well as its generalizations. The classical
location-shift model takes the form

$$ Y = m(X) + V, \quad V = Q_V(U), \tag{2.11} $$

where $U \sim U(0,1)$ is independent of $X$, and $m(\cdot)$ is a location functional, for example, the
conditional mean. The disturbance $V$ has the quantile function $Q_V(u)$, and $Y$ therefore
has conditional quantile function $Q_Y(u \mid x) = m(x) + Q_V(u)$. This model is parsimonious
in that covariates impact the outcome only through the location. Even though this is
a location model, it is clear that a general change in the distribution of covariates can
have heterogeneous effects on the entire marginal distribution of $Y$, affecting its various
quantiles in a differential manner. The regression function $m(x)$ is most commonly
modeled linearly in parameters, $m(x) = x'\beta$, and estimated using least squares or instrumental
variable methods. The quantile function $Q_V(u)$ can be left unrestricted and estimated
using the empirical quantile function of the residuals. Our results cover such common
estimation schemes as special cases, since we only require the estimates to satisfy a central
limit theorem.
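The estimation scheme just described, least squares for the location function followed by the empirical quantile function of the residuals, can be sketched as follows. The sketch assumes a single covariate plus an intercept and a tiny hand-made data set; the function names are hypothetical and the code is not the authors' implementation.

```python
import math

def fit_location_model(xs, ys):
    """Least squares fit of m(x) = a + b*x; returns (a, b, sorted residuals)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    residuals = sorted(y - a - b * x for x, y in zip(xs, ys))
    return a, b, residuals

def empirical_quantile(u, sorted_values):
    """Empirical u-quantile: smallest value whose empirical cdf is at least u."""
    k = max(0, math.ceil(u * len(sorted_values)) - 1)
    return sorted_values[k]

def cond_quantile(u, x, a, b, residuals):
    """Q_Y(u|x) = m(x) + Q_V(u) under the location-shift model (2.11)."""
    return a + b * x + empirical_quantile(u, residuals)

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 4.0, 6.0, 6.0]   # generated as 1 + 2x + v with v = -1, 1, 1, -1
a, b, residuals = fit_location_model(xs, ys)
print(a, b, cond_quantile(0.75, 2.0, a, b, residuals))
```

Because the covariate enters only through `a + b * x`, the fitted conditional quantile curves for different values of `u` are parallel in `x`, which is the restriction that the later examples relax.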

The location model has played a classical role in regression analysis. Most endogenous
and exogenous treatment effects models, for example, can be analyzed and estimated using
variations of this model; see, e.g., Chap. 25 in Cameron and Trivedi (2005). A variety of
standard survival and duration models also imply (2.11) after a transformation, e.g., the
Cox models with Weibull hazards and accelerated failure time models, cf. Doksum and
Gasko (1990).

The location-scale shift model is a generalization that enables the covariates to impact the
conditional distribution through the scale function as well:

$$ Y = m(X) + \sigma(X) \cdot V, \quad V = Q_V(U), \tag{2.12} $$

where $U \sim U(0,1)$ independently of $X$, and $\sigma(\cdot)$ is a positive scale function. In this model
the conditional quantile function takes the form $Q_Y(u \mid x) = m(x) + \sigma(x) Q_V(u)$. It is clear
that changes in the distribution of $X$ can have a nontrivial effect on the entire marginal
distribution of $Y$, affecting its various quantiles in a differential manner. This model can
be estimated through a variety of means; see, for example, Rutemiller and Bowers (1968)
and Koenker and Xiao (2002).

Example 2. Quantile regression. The quantile regression method directly models
the conditional quantile relationship

$$ Y = Q_Y(U \mid X), $$

without imposing the location shift or the location-scale shift mechanisms. The model
permits the covariates to impact $Y$ by changing not only the location and scale of the
distribution but also the entire shape. An early convincing example of such effects goes
back to Doksum (1974), who showed that regression data can be sharply inconsistent
with the location-scale shift paradigm. Quantile regression precisely addresses this issue.
The leading approach to quantile regression entails the approximation of the conditional
quantile function by a linear functional form $Q_Y(u \mid x) = x'\beta(u)$; see, e.g., Koenker and
Bassett (1978) and Koenker (2005).

Example 3. Duration Models. A common way to model the distribution functions
in duration and survival analysis is the Cox model:

$$ F_Y(y \mid x) = 1 - \exp(-\exp(m(x) + t(y))), $$

where $t(y)$ is a monotonic function in $y$. This model is rather rich, yet the role of covariates
is limited in an important way. In particular the model leads to the following location-shift
regression representation (up to the sign convention for $m$):

$$ t(Y) = m(X) + V, $$

where $V$ has an extreme value distribution. Therefore covariates impact the outcome
only through the location function. The estimation of this model has been a subject of
many studies, e.g., Lancaster (1990), Donald, Green, and Paarsch (2000), and Dabrowska
(2005).

Example 4. Distribution Regression. Instead of restricting ourselves to models
of the above kind, we can consider directly modeling $F_Y(y \mid x)$, separately for each threshold
$y$. An example is the model

$$ F_Y(y \mid x) = \Lambda(m(y, x)), $$

where $\Lambda$ is a known link function, and $m(y, x)$ is unrestricted in $y$. This specification
includes the previous example as a special case (put $\Lambda(v) = 1 - \exp(-\exp(v))$ and $m(y, x) =
m(x) + t(y)$) and allows for more flexible effects of the covariates. The leading example of
this specification would be a probit or logit link function $\Lambda$ and $m(y, x) = x'\beta(y)$, where $\beta(y)$
is an unknown function in $y$ (see, e.g., Foresi and Peracchi, 1995). This approach is similar
to quantile regression in spirit. In particular, like quantile regression, this approach leads
to the specification $Y = Q_Y(U \mid X) = m^{-1}(\Lambda^{-1}(U), X)$, where $m^{-1}(\cdot, x)$ denotes the inverse
of $y \mapsto m(y, x)$ and $U \sim U(0,1)$ independently of $X$.

Footnote: Throughout, by "linear" we mean specifications that are linear in the parameters but could be highly
non-linear in the original covariates; i.e., if the original covariate is $X$, then the conditional quantile
function takes the form $z'\beta(u)$ where $z = f(x)$.

2.5. Policy estimators. Given an estimator for the conditional distribution $\hat F_Y(y \mid x)$,
the marginal distributions for the outcome can be estimated by

$$ \hat F_Y^j(y) = \int_{\mathcal{X}} \hat F_Y(y \mid x)\,dF_X^j(x), \tag{2.13} $$

with corresponding quantile functions

$$ \hat Q_Y^j(u) = \inf\{y : \hat F_Y^j(y) \ge u\}, \tag{2.14} $$

for $j \in \{o, c\}$. Estimators for the quantile treatment effects and distribution effects can
then be constructed as

$$ \widehat{QTE}_Y(u) = \hat Q_Y^c(u) - \hat Q_Y^o(u), \tag{2.15} $$

and

$$ \widehat{DE}_Y(y) = \hat F_Y^c(y) - \hat F_Y^o(y). \tag{2.16} $$

For the general functionals introduced in (2.8), we can use sample analogs

$$ \hat H_Y(y) = \phi(\hat F_Y^o, \hat F_Y^c, y). \tag{2.17} $$

In the previous expressions the estimator for the conditional distribution can be
obtained directly from a conditional distribution model, or by inversion of a conditional
quantile estimator, that is:

$$ \hat F_Y(y \mid x) = \int_0^1 1\{\hat Q_Y(u \mid x) \le y\}\,du, \tag{2.18} $$

where $\hat Q_Y(u \mid x)$ is a given estimator of the conditional quantile function and the integral
can be approximated by a sum over a fine grid of the interval $[0, 1]$. Estimators for the
distribution and quantiles of the effects can be constructed similarly by replacing the
conditional functions by their estimators in the expressions (2.9) and (2.10).
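The grid approximation of (2.18) and the plug-in step (2.13) can be sketched together. In the sketch below, `q_hat` stands in for whatever fitted conditional quantile function is available; its simple linear form, the midpoint grid, and the use of an empirical covariate distribution for $F_X^j$ are assumptions made only for illustration.

```python
def cdf_from_quantile(y, x, q_hat, n_grid=1000):
    """(2.18): approximate the integral over u by an average over grid midpoints."""
    count = sum(1 for k in range(n_grid) if q_hat((k + 0.5) / n_grid, x) <= y)
    return count / n_grid

def marginal_cdf_hat(y, x_sample, q_hat, n_grid=1000):
    """(2.13), with F_X^j replaced by the empirical distribution of x_sample."""
    return sum(cdf_from_quantile(y, x, q_hat, n_grid) for x in x_sample) / len(x_sample)

def q_hat(u, x):
    """Hypothetical fitted conditional quantile function, linear in x and u."""
    return x + u

x_obs = [0.0, 0.5]
x_cf = [x + 0.5 for x in x_obs]   # counterfactual covariates: shift by one half
de_hat = marginal_cdf_hat(0.75, x_cf, q_hat) - marginal_cdf_hat(0.75, x_obs, q_hat)
print(marginal_cdf_hat(0.75, x_obs, q_hat), de_hat)
```

A finer grid reduces the approximation error of the inner integral at a cost that is linear in `n_grid`; the quantile and effect estimators in (2.14)-(2.16) then follow from the estimated marginal distributions in the same way as their population counterparts.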
2.6. Inference questions. Common inference questions that arise in policy analysis
involve features of the distribution of the outcome variable before and after the intervention.
For example, we might be interested in the average effect of the policy, or in quantile
treatment effects at several quantiles that measure the impact of the policy at different parts
of the outcome distribution. More generally, in this analysis it is very common that the
questions of interest involve the entire distribution or quantile functions of the outcomes.
Examples include the hypotheses that the policy has no effect, that the effect is constant,
or that it is positive for the entire distribution. The statistical problem is to account for
the sampling variability in the estimation of the conditional model to make inference on
the functionals of interest. The next section provides limit distribution theory for the
policy estimators. This theory applies to the entire distribution and quantile functions of
the outcome before and after the intervention, and therefore is valid to make both pointwise
inference about specific features of these functions, and simultaneous inference about the
entire distribution function, quantile function, or other functionals of interest.


3.  Limit  Distribution  Theory  for  Policy  Estimators 

The purpose of this section is to provide a set of simple general sufficient conditions
that facilitate the main large sample results on inference. Even though the conditions
are reasonably general, they do not exhaust all scenarios under which the main
inferential methods will be valid. The conditions are designed to cover the principal practical
approaches, and to help us think about what is needed for various approaches to work.

3.1. Estimators of the Conditional Model. We provide general assumptions about
the estimators of the conditional model for the relationship between the outcome and
covariates, which will allow us to derive the limit distribution for the policy estimators
constructed from them. These assumptions hold for commonly used parametric and
semiparametric estimation methods for conditional distribution and quantile models such
as linear quantile regression and proportional hazard models. We provide separate
assumptions for quantile and distribution estimators, and then show that in both cases the
ultimate estimator of the conditional distribution used to obtain the functionals of
interest satisfies a functional central limit theorem, which enables us to give a unified treatment
for all the policy estimators.

We start the analysis by discussing the approach based on quantile models. By $\ell^\infty((0,1) \times
\mathcal{X})$ we denote the space of bounded functions mapping from $(0,1) \times \mathcal{X}$ to $\mathbb{R}$, equipped with
the uniform metric. The following conditions impose some restrictions on the conditional
quantile model and on the corresponding quantile estimator:

C.1 There exists a conditional density $f_Y(y \mid x)$ that is continuous and bounded above
and away from zero, uniformly on $y \in \mathcal{Y}$ and $x \in \mathcal{X}$, where $\mathcal{Y}$ is a compact subset
of $\mathbb{R}$.

Q.1 The estimator of the conditional quantile function $(u, x) \mapsto \hat Q_Y(u \mid x)$ converges in
law to a continuous Gaussian process:

$$ \sqrt{n}\,\bigl(\hat Q_Y(u \mid x) - Q_Y(u \mid x)\bigr) \Rightarrow V(u, x), \tag{3.1} $$

in the space $\ell^\infty((0,1) \times \mathcal{X})$, where the random function $(u, x) \mapsto V(u, x)$ has zero
mean and uniformly bounded covariance function $\Sigma_V(u, x, \tilde u, \tilde x) := E[V(u, x) V(\tilde u, \tilde x)]$.

These conditions appear reasonable in practice when the outcome is continuous. If the
outcome is discrete, condition C.1 does not hold in the stated form. However, we
can deal with discrete outcomes via the distribution approach. Condition C.1 focuses
on the case where the outcome has compact support with bounded density, and is a
reasonable case to analyze in detail first. This condition could be extended to include
other cases, though none of the subsequent results are expected to change in an essential
manner. Condition Q.1 applies to the main estimators of conditional quantile functions
under suitable regularity conditions; cf. Gutenbrunner and Jureckova (1992), Angrist,
Chernozhukov, and Fernandez-Val (2006), and Appendix D.
Turning to the conditional distribution estimators, let $\ell^\infty(\mathcal{Y} \times \mathcal{X})$ denote the space of
bounded functions mapping from $\mathcal{Y} \times \mathcal{X}$ to $\mathbb{R}$, equipped with the uniform metric, where
$\mathcal{Y}$ is a compact subset of $\mathbb{R}$. The following condition imposes some regularity conditions
on the way the estimator of the distribution function should behave.

D.1 The estimated conditional distribution function $(y, x) \mapsto \hat F_Y(y \mid x)$ converges in law
to a continuous Gaussian process:

$$ \sqrt{n}\,\bigl(\hat F_Y(y \mid x) - F_Y(y \mid x)\bigr) \Rightarrow Z(y, x), \tag{3.2} $$

in the space $\ell^\infty(\mathcal{Y} \times \mathcal{X})$, where the random function $(y, x) \mapsto Z(y, x)$ has zero mean
and uniformly bounded covariance function $\Sigma_Z(y, x, \tilde y, \tilde x) := E[Z(y, x) Z(\tilde y, \tilde x)]$.

This condition holds for common estimators of conditional distribution functions; see,
e.g., Beran (1977), Dabrowska (2005), and Appendix D. These estimators, however, might
produce estimates that are not monotonic in the level of the outcome $y$; see, e.g., Foresi
and Peracchi (1995) and Hall, Wolff, and Yao (1999). One way to avoid this problem and
to improve the finite-sample properties of the conditional distribution estimators is to
rearrange the estimates. Start from an estimator $\hat{F}_Y(y|x)$ that satisfies condition D.1
but is not necessarily monotonic in $y$. This estimator can be rearranged in the
following two steps. First, construct

$$\hat{Q}(u|x) = \int_0^{\infty} 1\{\hat{F}_Y(y|x) \le u\}\,dy - \int_{-\infty}^{0} 1\{\hat{F}_Y(y|x) \ge u\}\,dy, \tag{3.3}$$

which is an estimator of the conditional quantile function that is monotone in the quantile
index $u$. Second, invert $\hat{Q}(u|x)$ to obtain

$$\tilde{F}_Y(y|x) = \inf\{u : \hat{Q}(u|x) > y\}, \tag{3.4}$$

a monotone estimator of the conditional distribution function which, under
assumption C.1, has the same first-order limit distribution as $\hat{F}_Y(y|x)$; see Chernozhukov,
Fernandez-Val, and Galichon (2006).

If we start from a conditional quantile model, we can use the relationship between
the distribution function and the quantile function to define the conditional distribution
function estimator in (2.18) from an available conditional quantile function estimator
$\hat{Q}_Y(u|x)$. It turns out that if the original quantile estimator satisfies conditions C.1
and Q.1, then the implied conditional distribution estimator satisfies condition D.1.
This result is convenient for the following analysis because it allows us to give a unified
treatment of the policy estimators based on both quantile models and distribution models.
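
As a rough numerical illustration of the two devices discussed above, the following sketch monotonizes a conditional distribution estimate by rearrangement and converts conditional quantile estimates into a distribution estimate through $\int_0^1 1\{\hat{Q}_Y(u|x) \le y\}\,du$. It assumes that the estimates are available on equally spaced grids, in which case the two-step rearrangement amounts to sorting the estimated values. The function names and the numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def rearrange_cdf(cdf_values):
    """Monotonize CDF estimates given on an increasing, equally spaced y-grid.

    Assumption: on such a grid the rearrangement reduces to sorting the values.
    """
    return np.sort(np.clip(np.asarray(cdf_values, dtype=float), 0.0, 1.0))

def cdf_from_quantiles(q_values, y_grid):
    """Approximate F(y|x) = integral_0^1 1{Q(u|x) <= y} du.

    q_values holds Q(u|x) at equally spaced quantile indices, so the integral
    is approximated by the share of grid points with Q(u|x) <= y.
    """
    q_values = np.asarray(q_values, dtype=float)
    y_grid = np.asarray(y_grid, dtype=float)
    return (q_values[None, :] <= y_grid[:, None]).mean(axis=1)

# Hypothetical non-monotone CDF estimate at five outcome levels.
f_hat = np.array([0.10, 0.35, 0.30, 0.70, 0.95])
f_rearranged = rearrange_cdf(f_hat)

# Hypothetical quantile estimates at u = 0.1, 0.3, 0.5, 0.7, 0.9.
q_hat = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
f_from_q = cdf_from_quantiles(q_hat, y_grid=[0.5, 2.0, 3.5, 5.0])
```

The rearranged values are nondecreasing by construction, and the distribution estimate built from quantiles is a step function in $y$, which is why its limit theory requires the functional arguments developed below.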

Lemma 1. Under the conditions C.1 and Q.1, the estimator of the conditional distribution
function in (2.18) satisfies the condition D.1 with

$$Z(y,x) = -f_Y(y|x)\,V(F_Y(y|x), x). \tag{3.5}$$

3.2. Basic principles. The derivation of the limit distribution for the policy estimators
is based on two basic principles that allow us to link the properties of the conditional
estimators with the properties of the estimators of the marginal distribution and quantile
functions. First, although there is no direct connection between conditional
and marginal quantiles, the law of iterated expectations links conditional and marginal
distributions. Second, by means of the delta method we can switch from the properties of the
estimators of the distribution function to the properties of the estimators of the quantile
function and vice versa. The main difficulty in the analysis is that the functionals of
interest depend on the entire process for the conditional function, so we need to resort
to a functional delta method. Moreover, the estimators of the conditional model usually
have discontinuities because their estimating equations involve indicator functions, which
further complicates the analysis.
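
The first principle can be sketched numerically: a marginal distribution is obtained by averaging the conditional distribution over a covariate sample, and the marginal quantile follows by inverting the result. The code below is only an illustration under assumed inputs. The conditional model (uniform on $[x, x+1]$), the covariate samples, and the function names are hypothetical.

```python
import numpy as np

def marginal_cdf(cond_cdf, y_grid, x_sample):
    """F^j(y) = integral of F(y|x) dF_X^j(x), with F_X^j the empirical law of x_sample."""
    y_grid = np.asarray(y_grid, dtype=float)
    x_sample = np.asarray(x_sample, dtype=float)
    # Rows index outcome levels, columns index covariate draws.
    return cond_cdf(y_grid[:, None], x_sample[None, :]).mean(axis=1)

def marginal_quantile(cdf_values, y_grid, u):
    """Left-inverse Q(u) = inf{y : F(y) >= u}, evaluated on the grid."""
    idx = np.searchsorted(cdf_values, u, side="left")
    return y_grid[min(idx, len(y_grid) - 1)]

# Hypothetical conditional model: Y | X = x is uniform on [x, x + 1].
cond_cdf = lambda y, x: np.clip(y - x, 0.0, 1.0)

y_grid = np.linspace(0.0, 2.0, 201)
x_obs = np.array([0.0, 1.0])                          # hypothetical observed covariates
x_cf = x_obs.mean() + 0.75 * (x_obs - x_obs.mean())   # hypothetical counterfactual covariates

cdf_obs = marginal_cdf(cond_cdf, y_grid, x_obs)
cdf_cf = marginal_cdf(cond_cdf, y_grid, x_cf)

# Quantile treatment effect at u = 0.25 (counterfactual minus observed).
qte_25 = marginal_quantile(cdf_cf, y_grid, 0.25) - marginal_quantile(cdf_obs, y_grid, 0.25)
```

Note that the marginal quantile is computed only after integrating the conditional distribution; averaging conditional quantiles over the covariates would not give the same object.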

The  key  ingredient  in  the  derivation  and  the  main  theoretical  contribution  of  the  paper 
is  to  establish  the  Hadamard  or  compact  differentiability  of  the  functionals  of  interest 
with  respect  to  the  limit  of  the  conditional  processes,  tangentially  to  the  subspace  of 
continuous  functions.  The  basic  Hadamard  differentiability  result  for  the  conditional 
distribution with respect to the conditional quantile function is given in Lemma 4 in the
Appendix, and the differentiability of the other functionals then follows by the properties
of  the  Hadamard  derivative.  These  results  enable  us  to  use  the  functional  delta  method 
to  derive  all  the  following  limit  distribution  theory. 

3.3. Limit distribution for marginal distribution and quantile functions. We are
now ready to state the first main results establishing that the estimators of the marginal
distribution and quantile functions satisfy a central limit theorem in large samples.

Theorem 1 (Limit Distribution for Marginal Distributions). Under conditions M.1,
M.2, and D.1, the estimators of the marginal distribution functions converge in law to
the following continuous linear functional of the Gaussian process $Z(y,x)$:

$$\sqrt{n}\,\bigl(\hat{F}_Y^j(y) - F_Y^j(y)\bigr) \Rightarrow \int_{\mathcal{X}} Z(y,x)\,dF_X^j(x) := Z^j(y), \tag{3.6}$$

in the space $\ell^\infty(\mathcal{Y})$, where the random function $y \mapsto Z^j(y)$ has zero mean and covariance
function

$$\Sigma_Z^j(y,\tilde{y}) := \int_{\mathcal{X}}\int_{\mathcal{X}} \Sigma_Z(y,x,\tilde{y},\tilde{x})\,dF_X^j(x)\,dF_X^j(\tilde{x}). \tag{3.7}$$

The convergence holds jointly for all estimators indexed by the status $j \in \{o,c\}$, with
cross-covariance function

$$\Sigma_Z^{oc}(y,\tilde{y}) := \int_{\mathcal{X}}\int_{\mathcal{X}} \Sigma_Z(y,x,\tilde{y},\tilde{x})\,dF_X^o(x)\,dF_X^c(\tilde{x}). \tag{3.8}$$

Theorem 2 (Limit Distribution for Marginal Quantiles). Under the conditions M.1,
M.2, C.1, and D.1, the estimators of the marginal quantile functions converge in law to
the following continuous linear Gaussian functional:

$$\sqrt{n}\,\bigl(\hat{Q}_Y^j(u) - Q_Y^j(u)\bigr) \Rightarrow -f_Y^j(Q_Y^j(u))^{-1}\,Z^j(Q_Y^j(u)) := V^j(u), \tag{3.9}$$

in the space $\ell^\infty((0,1))$, where $f_Y^j(y) = \int_{\mathcal{X}} f_Y(y|x)\,dF_X^j(x)$ and the random function
$u \mapsto V^j(u)$ has zero mean and covariance function

$$\Sigma_V^j(u,\tilde{u}) := f_Y^j(Q_Y^j(u))^{-1}\,f_Y^j(Q_Y^j(\tilde{u}))^{-1}\,\Sigma_Z^j(Q_Y^j(u), Q_Y^j(\tilde{u})). \tag{3.10}$$

The convergence holds jointly for all estimators indexed by the status $j \in \{o,c\}$, with
cross-covariance function

$$\Sigma_V^{oc}(u,\tilde{u}) := f_Y^o(Q_Y^o(u))^{-1}\,f_Y^c(Q_Y^c(\tilde{u}))^{-1}\,\Sigma_Z^{oc}(Q_Y^o(u), Q_Y^c(\tilde{u})). \tag{3.11}$$

Corollary 1 (Limit Distribution for Quantile Treatment Effects). Under the conditions
M.1, M.2, C.1, and D.1, the estimator of the quantile treatment effects converges in law
to the following linear functional of continuous Gaussian processes:

$$\sqrt{n}\,\bigl(\widehat{QTE}_Y(u) - QTE_Y(u)\bigr) \Rightarrow V^c(u) - V^o(u) := W(u), \tag{3.12}$$

in the space $\ell^\infty((0,1))$, where the random function $u \mapsto W(u)$ has zero mean and covariance
function

$$\Sigma_W(u,\tilde{u}) := \Sigma_V^o(u,\tilde{u}) + \Sigma_V^c(u,\tilde{u}) - \Sigma_V^{co}(u,\tilde{u}) - \Sigma_V^{oc}(u,\tilde{u}), \tag{3.13}$$

where $\Sigma_V^{co}(u,\tilde{u}) := \Sigma_V^{oc}(\tilde{u},u)$.

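Corollary 2 (Limit Distribution for Distribution Effects). Under the conditions M.1,
M.2, and D.1, the estimator of the distribution effects converges in law to the following
linear functional of continuous Gaussian processes:

$$\sqrt{n}\,\bigl(\widehat{DE}_Y(y) - DE_Y(y)\bigr) \Rightarrow Z^c(y) - Z^o(y) := S(y), \tag{3.14}$$

in the space $\ell^\infty(\mathcal{Y})$, where the random function $y \mapsto S(y)$ has zero mean and covariance
function

$$\Sigma_S(y,\tilde{y}) := \Sigma_Z^o(y,\tilde{y}) + \Sigma_Z^c(y,\tilde{y}) - \Sigma_Z^{oc}(y,\tilde{y}) - \Sigma_Z^{co}(y,\tilde{y}), \tag{3.15}$$

where $\Sigma_Z^{co}(y,\tilde{y}) := \Sigma_Z^{oc}(\tilde{y},y)$.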

Corollary 3 (Limit Distribution for Differentiable Functionals). Let $H_Y(y) = \phi(F_Y^o, F_Y^c, y)$
be a Hadamard differentiable functional in the first two arguments, with derivatives $\phi'_o$ and
$\phi'_c$ with respect to the first and second argument. Under the conditions M.1, M.2, and
D.1, the estimator of the functional $H_Y(y)$ defined in (2.17) converges in law to the
following linear functional of continuous Gaussian processes:

$$\sqrt{n}\,\bigl(\hat{H}_Y(y) - H_Y(y)\bigr) \Rightarrow \phi'_o(F_Y^o,F_Y^c,y)\,Z^o(y) + \phi'_c(F_Y^o,F_Y^c,y)\,Z^c(y) := T(y), \tag{3.16}$$

in the space $\ell^\infty(\mathcal{Y})$, where the random function $y \mapsto T(y)$ has zero mean and covariance
function

$$\Sigma_T(y,\tilde{y}) := \phi'_o\tilde{\phi}'_o\,\Sigma_Z^o(y,\tilde{y}) + \phi'_c\tilde{\phi}'_c\,\Sigma_Z^c(y,\tilde{y}) + \phi'_c\tilde{\phi}'_o\,\Sigma_Z^{co}(y,\tilde{y}) + \phi'_o\tilde{\phi}'_c\,\Sigma_Z^{oc}(y,\tilde{y}), \tag{3.17}$$

where $\phi'_j := \phi'_j(F_Y^o,F_Y^c,y)$ and $\tilde{\phi}'_j := \phi'_j(F_Y^o,F_Y^c,\tilde{y})$, for $j \in \{o,c\}$.

Remark 1. The previous corollary follows from the functional delta method; see, e.g.,
Theorem 20.8 in van der Vaart (1998). Examples of Hadamard differentiable functionals
include continuous transformations of linear functionals such as means, mean effects,
variances, variance effects, and Lorenz curves; cf. Fernholz (1983) and Barrett and
Donald (2000).
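
To make one of these functionals concrete, the sketch below computes Lorenz ordinates and the Gini coefficient from quantiles given on an equally spaced grid, using $L(p) = \int_0^p Q(u)\,du / \int_0^1 Q(u)\,du$ and the definition of the Gini coefficient as twice the area between the 45-degree line and the Lorenz curve. It is a minimal illustration with hypothetical inputs and function names, assuming positive outcomes, not the estimator used in the paper.

```python
import numpy as np

def lorenz_curve(quantile_values):
    """Lorenz ordinates L(k/n), k = 0..n, from quantiles on an equally spaced u-grid.

    The integrals of Q(u) are approximated by cumulative sums.
    """
    q = np.sort(np.asarray(quantile_values, dtype=float))
    cum = np.concatenate([[0.0], np.cumsum(q)])
    return cum / cum[-1]

def gini(quantile_values):
    """Gini coefficient: twice the area between the 45-degree line and the Lorenz curve."""
    lorenz = lorenz_curve(quantile_values)
    p = np.linspace(0.0, 1.0, len(lorenz))
    # Trapezoid rule for the area under the Lorenz curve.
    area_under_lorenz = np.sum((lorenz[1:] + lorenz[:-1]) / 2.0 * np.diff(p))
    return 1.0 - 2.0 * area_under_lorenz

gini_equal = gini([1.0, 1.0, 1.0, 1.0])      # perfectly equal outcomes
gini_unequal = gini([0.0, 0.0, 0.0, 1.0])    # one unit holds everything
```

Applying the same functions to the observed and counterfactual quantile functions gives the two Lorenz curves whose difference is the kind of effect covered by Corollary 3.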

3.4. Limit distribution for the estimators of the effects. For policy interventions
that can be implemented as a known transformation of the covariate, $X^c = g(X^o)$, we can
also identify and estimate the distribution of effects under the additional assumption of
conditional rank preservation. The following results provide estimators for the distribution
and quantile functions of the effects and limit distribution theory for them.

Lemma 2 (Limit distribution for estimators of conditional distribution and quantile
functions). Let $\hat{Q}_\Delta(u|x) = \hat{Q}_Y(u|g(x)) - \hat{Q}_Y(u|x)$ be an estimator of the conditional quantile
function of the effects $Q_\Delta(u|x)$. Under the conditions C.1, Q.1, and R.P, we have:

$$\sqrt{n}\,\bigl(\hat{Q}_\Delta(u|x) - Q_\Delta(u|x)\bigr) \Rightarrow V(u,g(x)) - V(u,x) := V_g(u,x), \tag{3.18}$$

in the space $\ell^\infty((0,1) \times \mathcal{X})$, where the Gaussian random function $(u,x) \mapsto V_g(u,x)$ has
zero mean and covariance function

$$\Omega_V(u,x,\tilde{u},\tilde{x}) := \Sigma_V(u,g(x),\tilde{u},g(\tilde{x})) + \Sigma_V(u,x,\tilde{u},\tilde{x}) - 2\Sigma_V(u,g(x),\tilde{u},\tilde{x}). \tag{3.19}$$


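Let $\hat{F}_\Delta(\delta|x) = \int_0^1 1\{\hat{Q}_\Delta(u|x) \le \delta\}\,du$ be an estimator of the conditional distribution of the
effects $F_\Delta(\delta|x)$. Under the conditions C.1, Q.1, and R.P, we have:

$$\sqrt{n}\,\bigl(\hat{F}_\Delta(\delta|x) - F_\Delta(\delta|x)\bigr) \Rightarrow -f_\Delta(\delta|x)\,V_g(F_\Delta(\delta|x),x) := Z_g(\delta,x), \tag{3.20}$$

in the space $\ell^\infty(\mathcal{D} \times \mathcal{X})$, where $\mathcal{D} = \{\delta \in \mathbb{R} : \delta = y - \tilde{y},\ y \in \mathcal{Y},\ \tilde{y} \in \mathcal{Y}\}$, and the random
function $(\delta,x) \mapsto Z_g(\delta,x)$ has zero mean and covariance function

$$\Omega_Z(\delta,x,\tilde{\delta},\tilde{x}) := f_\Delta(\delta|x)\,f_\Delta(\tilde{\delta}|\tilde{x})\,\Omega_V(F_\Delta(\delta|x),x,F_\Delta(\tilde{\delta}|\tilde{x}),\tilde{x}). \tag{3.21}$$

The conditional density of the effect, $f_\Delta(\delta|x)$, assumed to be bounded above and away from
zero, can be expressed in terms of the conditional density of the level of the outcome as

$$f_\Delta(\delta|x) = \left[\frac{1}{f_Y\bigl(Q_Y(F_\Delta(\delta|x)|g(x))\,|\,g(x)\bigr)} - \frac{1}{f_Y\bigl(Q_Y(F_\Delta(\delta|x)|x)\,|\,x\bigr)}\right]^{-1}. \tag{3.22}$$

Footnote to Remark 1: Goldie (1977) derives weak convergence results for Lorenz curves under
very weak conditions using a different approach.

Footnote to Lemma 2: In the distribution approach, $\hat{Q}_Y(u|x)$ can be obtained by inversion of the
estimator of the conditional distribution as in (3.3).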


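Theorem 3 (Limit distribution for estimators of the marginal distribution and quantile
functions). Let $\hat{F}_\Delta(\delta) = \int_{\mathcal{X}} \hat{F}_\Delta(\delta|x)\,dF_X^o(x)$ be an estimator of the marginal distribution
of the effects $F_\Delta(\delta)$. Under the conditions M.1, M.2, C.1, Q.1, and R.P, we have:

$$\sqrt{n}\,\bigl(\hat{F}_\Delta(\delta) - F_\Delta(\delta)\bigr) \Rightarrow \int_{\mathcal{X}} Z_g(\delta,x)\,dF_X^o(x) := Z_g(\delta), \tag{3.23}$$

in the space $\ell^\infty(\mathcal{D})$, where the Gaussian random function $\delta \mapsto Z_g(\delta)$ has zero mean and
covariance function

$$\Omega_Z(\delta,\tilde{\delta}) := \int_{\mathcal{X}}\int_{\mathcal{X}} \Omega_Z(\delta,x,\tilde{\delta},\tilde{x})\,dF_X^o(x)\,dF_X^o(\tilde{x}). \tag{3.24}$$

Let $\hat{Q}_\Delta(u) = \inf\{\delta : \hat{F}_\Delta(\delta) \ge u\}$ be an estimator of the marginal quantile function of the
effects $Q_\Delta(u)$. Under the conditions M.1, M.2, C.1, Q.1, and R.P, we have:

$$\sqrt{n}\,\bigl(\hat{Q}_\Delta(u) - Q_\Delta(u)\bigr) \Rightarrow -f_\Delta(Q_\Delta(u))^{-1}\,Z_g(Q_\Delta(u)) := V_g(u), \tag{3.25}$$

in the space $\ell^\infty((0,1))$, where $f_\Delta(\delta) = \int_{\mathcal{X}} f_\Delta(\delta|x)\,dF_X^o(x)$ and the Gaussian random
function $u \mapsto V_g(u)$ has zero mean and covariance function

$$\Omega_V(u,\tilde{u}) := f_\Delta(Q_\Delta(u))^{-1}\,f_\Delta(Q_\Delta(\tilde{u}))^{-1}\,\Omega_Z(Q_\Delta(u),Q_\Delta(\tilde{u})). \tag{3.26}$$

Footnote to Lemma 2 (on the bounded conditional density of the effect): This assumption rules out
degenerate distributions of effects, such as constant treatment effects. These "distributions"
can be estimated using standard regression methods.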

Example 2. Quantile regression. To illustrate the previous analysis, it is convenient
to consider the linear quantile regression model where $Q_Y(u|x) = x'\beta(u)$. In this case,
under suitable regularity conditions and i.i.d. sampling, the Koenker and Bassett (1978)
quantile regression estimator satisfies

$$\sqrt{n}\,\bigl(\hat{\beta}(u) - \beta(u)\bigr) \Rightarrow J(u)^{-1}B(u), \tag{3.27}$$

where $B(u)$ is a zero mean Gaussian process with covariance function
$(\min(u,\tilde{u}) - u\tilde{u})\,E[XX']$, proportional to the covariance function of a Brownian bridge, and
$J(u) = E[f_Y(Q_Y(u|X)|X)\,XX']$. Hence,

$$\sqrt{n}\,\bigl(\hat{Q}_Y(u|x) - Q_Y(u|x)\bigr) = \sqrt{n}\,\bigl(x'\hat{\beta}(u) - x'\beta(u)\bigr) \Rightarrow V(u,x) = x'J(u)^{-1}B(u), \tag{3.28}$$

with covariance function given by:

$$\Sigma_V(u,x,\tilde{u},\tilde{x}) = (\min(u,\tilde{u}) - u\tilde{u})\,x'J(u)^{-1}E[XX']J(\tilde{u})^{-1}\tilde{x}. \tag{3.29}$$

Note that this covariance function is uniformly bounded under our assumptions if the
Jacobian $J(u)$ is nonsingular uniformly in $u$.
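
For readers who want to evaluate (3.29) numerically, the following sketch computes the covariance function of the quantile regression process for given values of $J(u)$ and $E[XX']$. The inputs are hypothetical: the Jacobian is taken as $J(u) = f\,E[XX']$ with a constant density value $f$, which is a simplifying assumption used only to obtain a closed-form check. In practice both matrices would be replaced by estimates.

```python
import numpy as np

def qr_process_covariance(u, x, u_tilde, x_tilde, jacobian, second_moment):
    """Evaluate (min(u,u~) - u u~) x' J(u)^{-1} E[XX'] J(u~)^{-1} x~.

    jacobian is a callable u -> J(u); second_moment is the matrix E[XX'].
    """
    left = np.linalg.solve(jacobian(u).T, x)              # (x' J(u)^{-1})'
    right = np.linalg.solve(jacobian(u_tilde), x_tilde)   # J(u~)^{-1} x~
    return (min(u, u_tilde) - u * u_tilde) * (left @ second_moment @ right)

# Hypothetical design: X = (1, x) with E[XX'] given below and constant density f = 2.
second_moment = np.array([[1.0, 0.5], [0.5, 1.0]])
density = 2.0
jacobian = lambda u: density * second_moment

x = np.array([1.0, 1.0])
x_tilde = np.array([1.0, 0.0])
sigma_v = qr_process_covariance(0.25, x, 0.5, x_tilde, jacobian, second_moment)
```

Under this assumed design the expression collapses to $(\min(u,\tilde{u}) - u\tilde{u})\,x'E[XX']^{-1}\tilde{x}/f^2$, which gives a simple way to validate an implementation before plugging in estimated components.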

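The covariance function for $\hat{Q}_Y^j(u)$ takes the form:

$$\Sigma_V^j(u,\tilde{u}) = \frac{\int\!\!\int f_Y(Q_Y^j(u)|x)\,f_Y(Q_Y^j(\tilde{u})|\tilde{x})\,\Sigma_V\bigl(F_Y(Q_Y^j(u)|x),x,F_Y(Q_Y^j(\tilde{u})|\tilde{x}),\tilde{x}\bigr)\,dF_X^j(x)\,dF_X^j(\tilde{x})}{\bigl[\int f_Y(Q_Y^j(u)|x)\,dF_X^j(x)\bigr]\bigl[\int f_Y(Q_Y^j(\tilde{u})|x)\,dF_X^j(x)\bigr]}, \tag{3.30}$$

where

$$f_Y(y|x) = \left.\frac{1}{x'\,d\beta(u)/du}\right|_{u = F_Y(y|x)} = \frac{1}{x'J(F_Y(y|x))^{-1}E[X]}; \tag{3.31}$$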

see the proof of Theorem 3 in Angrist, Chernozhukov, and Fernandez-Val (2006). Similar
expressions can be obtained for the covariance functions of the estimators of other
functionals of the marginal distributions of the outcomes and effects. In Appendix B we
provide consistent estimators for the components of these expressions that can be used to
perform pointwise inference about specific features of the policy effect in this model.

3.5.  Uncertainty  about  the  distribution  of  the  covariates.  The  previous  analysis 
assumes  that  the  distributions  of  the  covariates  before  and  after  the  policy  intervention 
are  known  in  the  population.  In  practice,  however,  we  usually  only  observe  a  sample  of 
the  covariates  and  outcome  before  the  intervention  and  a  sample  of  the  covariates  after 
the  intervention.  In  this  case  the  previous  limit  distribution  theory  is  still  valid  to  make 
inference  about  the  individuals  in  the  sample,  but  in  order  to  make  inference  about  the 
entire  population  we  need  to  take  into  account  the  additional  source  of  variation  coming 
from  the  estimation  of  the  distributions  of  the  covariates. 

Let $n/\lambda^j$ denote the sample size for the covariates before and after the policy intervention,
where $j$ indexes the status, $j \in \{o,c\}$, and we normalize $\lambda^o = 1$. We make the basic
assumption that the estimator of the distribution function of the covariates $x \mapsto \hat{F}_X^j(x)$
converges in law to a Gaussian process:

$$\sqrt{n}\,\bigl(\hat{F}_X^j(x) - F_X^j(x)\bigr) \Rightarrow \sqrt{\lambda^j}\,B_X^j(x), \tag{3.32}$$

in the space $\ell^\infty(\mathcal{X})$. The convergence holds jointly for all estimators indexed by the status
$j \in \{o,c\}$. This assumption is not very restrictive, as it is satisfied by the empirical
distribution function under general sampling conditions. The joint convergence holds trivially
in the leading cases where the counterfactual distribution is a known transformation of
the observed distribution, or when the two distributions are estimated from independent
samples.

Footnote: Under i.i.d. sampling, for example, $B_X^j$ is an $F_X^j$-Brownian bridge with zero mean and
covariance function $E[B_X^j(x)B_X^j(\tilde{x})] = F_X^j(\min(x,\tilde{x})) - F_X^j(x)F_X^j(\tilde{x})$, where the minimum is taken
componentwise; see, e.g., Billingsley (1968) and Neuhaus (1971).

The estimation of the covariate distribution affects the inference processes for the
functionals of interest. Take, for example, the marginal distribution functions. When the
covariate distribution is unknown, a feasible estimator for these functions can be
constructed as $\hat{F}_Y^j(y) = \int_{\mathcal{X}} \hat{F}_Y(y|x)\,d\hat{F}_X^j(x)$. The limit process for this estimator becomes:

$$\sqrt{n}\,\bigl(\hat{F}_Y^j(y) - F_Y^j(y)\bigr) \Rightarrow Z^j(y) + \sqrt{\lambda^j}\int_{\mathcal{X}} F_Y(y|x)\,dB_X^j(x), \tag{3.33}$$

where the first component comes from the estimation of the conditional model and the
second comes from the estimation of the distribution of the covariates. These components
are independent under correct specification of the conditional model, leading to the
following covariance function for the limit process under i.i.d. sampling:

$$\Sigma_Z^j(y,\tilde{y}) + \lambda^j \int_{\mathcal{X}} \bigl[F_Y(y|x) - F_Y^j(y)\bigr]\bigl[F_Y(\tilde{y}|x) - F_Y^j(\tilde{y})\bigr]\,dF_X^j(x). \tag{3.34}$$

Similar expressions can be obtained for the other functionals of interest. In Appendix C,
we re-derive the main limit distribution results for the case where the covariate
distributions are estimated.
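
The second term in (3.34) has a direct sample analog: it is $\lambda^j$ times the covariance, across the covariate sample, of the conditional distribution evaluated at $y$ and at $\tilde{y}$. The sketch below illustrates this under the assumption that the conditional distribution values $F_Y(y|x_i)$ are already available for each covariate observation. The function name and inputs are hypothetical.

```python
import numpy as np

def covariate_variance_term(cond_cdf_y, cond_cdf_y_tilde, lam=1.0):
    """Sample analog of lam * int [F(y|x) - F^j(y)][F(y~|x) - F^j(y~)] dF_X^j(x).

    cond_cdf_y and cond_cdf_y_tilde hold F_Y(y|x_i) and F_Y(y~|x_i) over the
    covariate sample; their sample means play the role of F^j(y) and F^j(y~).
    """
    a = np.asarray(cond_cdf_y, dtype=float)
    b = np.asarray(cond_cdf_y_tilde, dtype=float)
    return lam * np.mean((a - a.mean()) * (b - b.mean()))

# If the conditional distribution does not vary with x, the term vanishes.
term_no_variation = covariate_variance_term([0.3, 0.3, 0.3], [0.6, 0.6, 0.6])

# Hypothetical values of F_Y(y|x_i) for two covariate observations, with y = y~.
term_with_variation = covariate_variance_term([0.2, 0.4], [0.2, 0.4])
```

The first case shows why this extra variation matters only when the covariates actually shift the conditional distribution of the outcome.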

3.6. Uniform inference and resampling methods. The previous limit distribution
results can be readily applied to make inference on particular features of the distributions
of the outcome before and after the policy. Thus, for example, a direct implication of
Corollary 1 is that the quantile treatment effect estimator for a given quantile $u$ is
asymptotically normal with mean $QTE_Y(u)$ and variance $\Sigma_W(u,u)/n$. We
can therefore routinely carry out pointwise inference on $QTE_Y(u)$ using the normal
distribution, replacing the unknown components of $\Sigma_W(u,u)$ by consistent sample estimates.

Pointwise inference, however, only permits looking at specific aspects of the effect of
the policy separately. This might be restrictive for policy analysis, where the quantities
and hypotheses of interest usually involve the entire distribution of the observed and
counterfactual outcomes. Thus, for example, hypotheses such as that the policy
has no effect, has a constant effect, or has a beneficial effect cannot be tested by looking at
specific quantiles of the outcome distribution. Moreover, simultaneous inference corrections
to pointwise procedures based on the normal distribution, such as Bonferroni-type
corrections, can be very conservative when the hypotheses are highly dependent.
These procedures are also no longer suitable for testing a continuum of hypotheses.
A convenient and computationally attractive alternative to perform inference
on functions is to use Kolmogorov-type procedures based on the entire limit processes.

Kolmogorov-type  inference  is  complicated  in  this  case  because  the  inference  processes 
are  non-pivotal,  as  their  covariance  functions  depend  on  unknown,  though  estimable, 
nuisance  parameters.  Moreover,  there  does  not  seem  to  be  a  simple  transformation  to 
make  these  limit  processes  distribution-free.  Similar  non-pivotality  issues  arise  in  a  variety 
of  goodness-of-fit  problems  studied  by  Durbin  and  others,  and  are  referred  to  as  the 
Durbin problem by Koenker and Xiao (2002). This problem makes analytical methods
for inference more difficult to implement. The limit distribution of the Kolmogorov test
statistics can be simulated by replacing the covariance functions with uniformly consistent
estimates, but this procedure can be computationally cumbersome. Moreover, a new
simulation  is  needed  for  each  application  because  the  limit  processes  are  non-standard 
and  design-specific. 

A way to partially overcome the problems of the analytical approach is to use resampling
methods. Our limit distribution results rely only on the compact differentiability of the
functionals of interest with respect to the limit of the conditional processes, and on
these conditional processes following a functional central limit theorem. An additional
advantage of this general approach is that compact differentiability preserves the
validity of the bootstrap for inference on the functionals of interest, provided that the
underlying conditional processes can be bootstrapped. This result, stated formally in the next
corollary, follows directly from the functional delta method for the bootstrap; see, e.g.,
van der Vaart (1998).

Corollary 4 (Validity of bootstrap for uniform inference). If the conditional processes
(3.1), (3.2), and (3.18) satisfy the conditions to guarantee the validity of bootstrap, then
the limit processes (3.6), (3.9), (3.12), (3.14), (3.16), (3.23), and (3.25) also satisfy these
conditions.

The bootstrap procedure can be computationally intensive, but it avoids estimating the
components of the covariance functions. Moreover, if the sample size is large, the
computational complexity of the bootstrap procedure can be reduced by resampling the first
order conditions of the estimators of the conditional models, see Parzen, Wei, and Ying
(1994) and Chernozhukov and Hansen (2006); or by using subsampling, see Chernozhukov
and Fernandez-Val (2006).
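
The text does not spell out how bootstrap draws are turned into simultaneous bands, so the following is only a sketch of one common construction, assumed here for illustration: standardize the bootstrap deviations pointwise, take the supremum over the grid for each draw, and use a quantile of these suprema as the critical value. The function name, the simulated draws, and the use of bootstrap standard deviations as scale estimates are all assumptions.

```python
import numpy as np

def uniform_band(estimate, bootstrap_draws, level=0.90):
    """Sup-t simultaneous band from bootstrap draws of a function on a grid.

    estimate has shape (G,); bootstrap_draws has shape (B, G).
    """
    estimate = np.asarray(estimate, dtype=float)
    draws = np.asarray(bootstrap_draws, dtype=float)
    se = draws.std(axis=0, ddof=1)                          # pointwise bootstrap scale
    max_t = np.max(np.abs(draws - estimate) / se, axis=1)   # sup over the grid, per draw
    crit = np.quantile(max_t, level)                        # critical value of the sup statistic
    return estimate - crit * se, estimate + crit * se, crit

# Hypothetical quantile function estimate on the grid {0.10, 0.12, ..., 0.90}
# with simulated bootstrap draws around it.
rng = np.random.default_rng(0)
grid = np.linspace(0.10, 0.90, 41)
estimate = 500.0 + 400.0 * grid
draws = estimate + rng.normal(scale=20.0, size=(500, grid.size))

lower, upper, crit = uniform_band(estimate, draws, level=0.90)
```

Because the critical value is taken from the distribution of the supremum, it exceeds the pointwise normal critical value, which is the price of covering the whole function at the stated level.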

4.  Illustrative  Examples 

4.1. Empirical example. To illustrate the applicability of the previous results to
inference on counterfactual distributions, we consider the estimation of expenditure
curves. We use the Engel (1857) data set, originally collected by Ducpetiaux (1855) and Le
Play (1855) from 235 budget surveys of 19th century working-class Belgian households,
to estimate the relationship between food expenditure and annual household income (see
also Perthel, 1975). Ernst Engel originally presented these data to support the hypothesis
that food expenditure constitutes a declining share of household income (Engel's
Law). Here, we estimate marginal quantile functions of food expenditure under different
distributions for the annual household income.

Footnote: All the computations were carried out using the software R (R Development Core
Team, 2007) and the quantile regression package quantreg (Koenker, 2007).

For our counterfactual exercise we consider two distributions of income: the observed
distribution and a hypothetical redistribution. The redistribution consists of a neutral
reallocation of income from above to below the mean that reduces the standard deviation
of the observed income by 25%. This policy can be implemented by the progressive income
tax

$$X^c = E[X^o] + 0.75\,(X^o - E[X^o]), \tag{4.1}$$

which yields a counterfactual distribution of income

$$F_X^c(x) = F_X^o\bigl(E[X^o] + (x - E[X^o])/0.75\bigr), \tag{4.2}$$

where $F_X^o$ is the observed distribution of income in the data set. The observed and
counterfactual distributions of income are plotted in Figure 1. Here we can see that the
counterfactual distribution is a mean-preserving contraction of the observed distribution that
reduces the standard deviation by 25% while keeping the mean constant.
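
The transformation in (4.1) and the implied distribution in (4.2) are easy to check numerically. The sketch below applies them to a small hypothetical income sample (not the Engel data) and uses the empirical distribution in place of $F_X^o$.

```python
import numpy as np

def redistribute(income, shrink=0.75):
    """Counterfactual income X^c = E[X^o] + shrink * (X^o - E[X^o]), as in (4.1)."""
    income = np.asarray(income, dtype=float)
    mean = income.mean()
    return mean + shrink * (income - mean)

def counterfactual_cdf(x, income, shrink=0.75):
    """F_X^c(x) = F_X^o(E[X^o] + (x - E[X^o]) / shrink), as in (4.2).

    F_X^o is taken to be the empirical distribution of the income sample.
    """
    income = np.asarray(income, dtype=float)
    mean = income.mean()
    return np.mean(income <= mean + (x - mean) / shrink)

# Hypothetical income sample, for illustration only.
income_obs = np.array([400.0, 600.0, 800.0, 1000.0, 2200.0])
income_cf = redistribute(income_obs)

sd_ratio = income_cf.std() / income_obs.std()
cdf_at_860 = counterfactual_cdf(860.0, income_obs)
```

The mean is unchanged, the standard deviation falls to 75% of its original value, and evaluating (4.2) agrees with the empirical distribution of the transformed incomes.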

We consider three different conditional models for the relationship between food
expenditure and annual income: a linear location-shift model, a quantile regression model,
and a distribution regression model. For the location model we estimate the location
parameters by least squares and use the sample quantiles of the residuals to estimate
the quantiles of the disturbance. For the linear quantile model, the entire conditional
quantile function is estimated by running quantile regressions at multiple quantiles. For
the distribution model, the conditional distribution is obtained by estimating logits of the
indicator functions $1\{Y \le y\}$ on the covariates for a grid of values of $y$ corresponding
to the values of food expenditure in the data set. The logit estimates are monotonized
by rearrangement, which also produces monotone estimates of the conditional quantile
function.

To analyze the effect of our hypothetical "policy" exercise on the distribution of food
expenditure, Figures 2, 3, and 4 plot 90% simultaneous bands for the marginal quantile
functions of food expenditure before and after the income redistribution based on the three
different conditional models. These uniform bands, which allow us to perform inference
on the functions without compromising the joint confidence level, are constructed using
500 bootstrap repetitions and a grid of quantiles {0.10, 0.11, ..., 0.90}. The left panels
show estimates of the observed and counterfactual quantile functions of food expenditure.
For the three conditional estimation methods, the redistribution has a slightly bigger
effect on the lower tail of the expenditure distribution, but the confidence bands are
significantly different only for a few quantiles in the quantile regression model. The right
panels refine this finding by plotting 90% simultaneous bands for the quantile treatment effects.
Here we can see more clearly that the policy has a positive impact on the lower tail of the
expenditure distribution, and a negative effect on the upper tail. This result is consistent
with a consumption pattern where food expenditure is highly correlated with income, and
all the households adjust their levels of expenditure to the changes in income.

Since  in  this  case  we  have  a  perfectly  controlled  experiment,  we  can  also  estimate  the 
distribution  of  effects  of  the  policy.  Figure  5  plots  90%  confidence  bands  for  the  quantiles 
of  the  effects  based  on  the  three  conditional  estimators.  Here,  we  can  see  that  the  effects  of 
the redistribution can be large and very heterogeneous. Figure 6 shows that the progressive
income redistribution reduces food expenditure inequality based on 90% confidence sets
for Lorenz curves and the corresponding Gini coefficients.

Footnote: The Gini coefficient is twice the area between the 45° line (line of perfect equality)
and the Lorenz curve. It takes values between 0 and 1, with 0 corresponding to perfect equality
and 1 corresponding to perfect inequality.

4.2. Monte Carlo. We conduct a Monte Carlo experiment, matching closely the previous
empirical application, to illustrate the finite sample properties of the policy estimators. In
particular, we consider a data generating process (DGP) based on a conditional location-scale
shift model: $Y = Z(X)'\alpha + (Z(X)'\gamma)\epsilon$, where $\epsilon$ is independent of $X$, with true
conditional quantile function

$$Q_Y(u|X) = Z(X)'\alpha + (Z(X)'\gamma)\,Q_\epsilon(u).$$

The regressor vector $Z(X)$ includes a constant and a covariate $X$, namely $Z(X) = (1, X)'$.
The observed distribution of $X$ corresponds to the empirical distribution of income in the
Engel data set, whereas the counterfactual distribution corresponds to a neutral income
redistribution that reduces the standard deviation by 25%, implemented as in (4.1). The
parameters of the conditional model are set to $\alpha = (624.15, 0.55)$ and $\gamma = (1, 0.0013)$.
These values are calibrated to match the Engel empirical example, employing the estimation
method of Koenker and Xiao (2002). We draw 1,000 Monte Carlo samples of size
$n = 235$ from the DGP. To generate the values of the observed outcome, we draw
observations from a normal distribution with the same mean and variance as the residuals
$\epsilon = (Y - Z(X)'\alpha)/(Z(X)'\gamma)$ of the Engel data set; and we draw values for the observed
covariate $X^o$ from the empirical distribution of income.
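
A minimal sketch of this data generating process is given below. The parameter values are those reported in the text. Everything else is a placeholder: the income values stand in for the Engel data, which are not reproduced here, and the disturbance is drawn from a standard normal because the mean and variance of the Engel residuals are not reported in this section.

```python
import numpy as np
from statistics import NormalDist

ALPHA = np.array([624.15, 0.55])   # location parameters reported in the text
GAMMA = np.array([1.0, 0.0013])    # scale parameters reported in the text

def simulate_outcome(x, eps):
    """Y = Z(X)'alpha + (Z(X)'gamma) * eps with Z(X) = (1, X)'."""
    z = np.column_stack([np.ones_like(x), x])
    return z @ ALPHA + (z @ GAMMA) * eps

def true_conditional_quantile(u, x, eps_mean=0.0, eps_sd=1.0):
    """Q_Y(u|x) = Z(x)'alpha + (Z(x)'gamma) * Q_eps(u) for a normal disturbance.

    eps_mean and eps_sd are placeholders for the residual moments.
    """
    z = np.column_stack([np.ones_like(x), x])
    q_eps = NormalDist(eps_mean, eps_sd).inv_cdf(u)
    return z @ ALPHA + (z @ GAMMA) * q_eps

rng = np.random.default_rng(0)
income_placeholder = np.array([400.0, 600.0, 800.0, 1000.0, 2200.0])  # hypothetical incomes
x_sim = rng.choice(income_placeholder, size=235, replace=True)        # one sample of size n = 235
eps_sim = rng.normal(0.0, 1.0, size=235)                              # placeholder disturbance
y_sim = simulate_outcome(x_sim, eps_sim)
```

Repeating the last three lines 1,000 times, with the actual income data and residual moments, would reproduce the structure of the experiment described above.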

We  consider  three  different  estimators  for  the  conditional  model  to  assess  the  properties 
of  the  policy  estimators  under  correct  and  incorrect  specification.  Thus,  in  each  replication 
we  estimate  the  conditional  distribution  or  quantile  function  using  a  correctly  specified 
linear  quantile  regression  model  and  misspecified  linear  location-shift  and  logit  regression 
models.  The  functionals  of  interest  are  the  quantile  functions  of  the  observed  and  coun- 
terfactual  outcome,  the  quantile  treatment  effects  function,  the  quantile  function  of  the 
distribution  of  the  effects,  Lorenz  curves  for  the  observed  and  counterfactual  outcome  dis- 
tributions, and  the  corresponding  Gini  coefficients.  For  each  of  the  conditional  estimators 
considered,  these  functionals  are  obtained  using  the  procedure  described  in  section  2.5, 
where  the  observed  and  counterfactual  distributions  of  the  covariates  are  estimated  by  the 
empirical  distribution  of  income  in  each  replication  and  by  applying  the  transformation 
in  (4.2)  to  this  empirical  distribution.  We  use  bootstrap  with  200  repetitions  to  obtain 
90%  confidence  bands  for  the  functionals  of  interest  in  each  replication. 

Table 1 reports measures of the bias and inference properties for the policy estimators.
The integrated bias is obtained by Monte Carlo average of

$$\int |\hat{f}(t) - f_0(t)|\,dt, \tag{4.3}$$

where $f_0(t)$ is the true functional and $\hat{f}(t)$ is its policy estimator. Coverage probabilities
correspond to Monte Carlo frequencies that the 90% bootstrap confidence bands include
the entire functional of interest. The columns labeled Length 90% CI / 5-95 Range
give the integrated ratio of the Monte Carlo average length of the 90% confidence band
to the Monte Carlo 5-95 quantile spread of the estimates.
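
The two main summary measures can be sketched as follows, assuming that each replication delivers an estimate and a confidence band evaluated on a common grid. The trapezoid rule used for the integral in (4.3) and the function names are illustrative choices, not details taken from the paper.

```python
import numpy as np

def integrated_abs_error(estimate, truth, grid):
    """Trapezoid approximation of int |f_hat(t) - f_0(t)| dt on the grid."""
    err = np.abs(np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float))
    return np.sum((err[1:] + err[:-1]) / 2.0 * np.diff(grid))

def uniform_coverage(lower, upper, truth):
    """Share of replications whose band contains the entire true function.

    lower and upper have shape (replications, grid points).
    """
    truth = np.asarray(truth, dtype=float)
    inside = (np.asarray(lower) <= truth) & (truth <= np.asarray(upper))
    return np.mean(np.all(inside, axis=1))

# Hypothetical example: an estimate shifted upward by 0.1 over the unit interval.
grid = np.linspace(0.0, 1.0, 11)
bias_one_rep = integrated_abs_error(grid + 0.1, grid, grid)

# Two hypothetical replications; only the first band covers the whole function.
truth = np.array([0.0, 1.0, 1.5])
lower = np.array([[-1.0, -1.0, -1.0], [0.5, 0.5, 0.5]])
upper = np.array([[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]])
coverage = uniform_coverage(lower, upper, truth)
```

Averaging `integrated_abs_error` over replications gives the integrated bias measure, while `uniform_coverage` counts a replication as a success only if the band contains the function at every grid point.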

Overall,  the  results  in  Table  1  show  similar  patterns  to  the  empirical  example  with  the 
location  and  quantile  regression  models  giving  more  precise  (relative  to  the  5-95  quantile 
spread)  bands  than  the  distribution  regression  for  the  quantile  and  quantile  treatment 
effects functions. Misspecification of the conditional model introduces bias in the policy
estimators, especially for the more restrictive location model. The coverage frequencies
for the correctly specified quantile regression model are close to their nominal levels for
the quantile and quantile treatment effects functions, even for a sample size of only 235
observations. The logit regression also has coverage frequencies close to the nominal levels,
as the misspecification bias is compensated by overestimation of the size of the confidence
bands. The bands for the Lorenz curves and Gini coefficients generally have lower coverage
than the nominal level.

5.  Conclusion 

This  paper  provides  methods  to  make  inference  about  the  effect  on  an  outcome  of  inter- 
est of  a  change  in  the  distribution  of  policy-related  variables.  The  validity  of  the  proposed 
inference procedures in large samples relies only on the applicability of a functional central
limit  theorem  for  the  estimator  of  the  relationship  between  the  outcome  and  covariates. 
This  condition  holds  for  common  semiparametric  estimators  of  conditional  distribution 
and  quantile  functions,  such  as  quantile  regression  and  proportional  hazard  models.  It 
would  be  interesting  to  extend  the  analysis  to  the  case  where  the  assumptions  about  the 
conditional  model  are  relaxed  by  using  nonparametric  conditional  distribution  or  quantile 
estimators.  This  extension  is  the  object  of  current  resea.rch  by  the  authors. 

Appendix A. Proofs

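A.1. Notation. Define $Y_x := Q_Y(U|x)$, where $U \sim \mathrm{Uniform}(\mathcal{U})$ with $\mathcal{U} = (0,1)$. Denote by $\mathcal{Y}_x$ the support of $Y_x$, $\mathcal{YX} := \{(y,x) : y \in \mathcal{Y}_x, x \in \mathcal{X}\}$, and $\mathcal{UX} := \mathcal{U} \times \mathcal{X}$. We assume throughout that $\mathcal{Y}_x \subset \mathcal{Y}$, which is a compact subset of $\mathbb{R}$, and that $x \in \mathcal{X}$, a compact subset of $\mathbb{R}^p$. In what follows, $\ell^\infty(\mathcal{UX})$ denotes the set of bounded and measurable functions $h : \mathcal{UX} \to \mathbb{R}$, and $C(\mathcal{UX})$ denotes the set of continuous functions $h : \mathcal{UX} \to \mathbb{R}$.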

A.2. Auxiliary Lemmas.

Lemma 3 (Equivalence between continuous convergence and uniform convergence). Let $D$ and $D'$ be complete separable metric spaces, with $D$ compact. Suppose $f : D \to D'$ is continuous. Then a sequence of functions $f_n : D \to D'$ converges to $f$ uniformly on $D$ if and only if for any convergent sequence $x_n \to x$ in $D$ we have that $f_n(x_n) \to f(x)$.

Proof of Lemma 3: See, for example, Resnick (1987), page 2. □

Lemma 4 (Hadamard Derivative of $F_Y(y|x)$ with respect to $Q_Y(u|x)$). Define $F_Y(y|x,h_t) := \int_0^1 1\{Q_Y(u|x) + t h_t(u|x) \le y\}\,du$. Under condition C.1, as $t \searrow 0$,

\[ D_{h_t}(y|x,t) := \frac{F_Y(y|x,h_t) - F_Y(y|x)}{t} \to D_h(y|x), \tag{A.1} \]

\[ D_h(y|x) := -f_Y(y|x)\, h(F_Y(y|x)|x). \tag{A.2} \]

The convergence holds uniformly in any compact subset of $\mathcal{YX} := \{(y,x) : y \in \mathcal{Y}_x, x \in \mathcal{X}\}$, for every $\|h_t - h\|_\infty \to 0$, where $h_t \in \ell^\infty(\mathcal{UX})$, and $h \in C(\mathcal{UX})$.

Proof of Lemma 4: We have that for any $\delta > 0$, there exists $\epsilon > 0$ such that for $u \in B_\epsilon(F_Y(y|x))$ and for small enough $t > 0$

\[ 1\{Q_Y(u|x) + t h_t(u|x) \le y\} \le 1\{Q_Y(u|x) + t(h(F_Y(y|x)|x) - \delta) \le y\}; \]

whereas for all $u \notin B_\epsilon(F_Y(y|x))$, as $t \searrow 0$,

\[ 1\{Q_Y(u|x) + t h_t(u|x) \le y\} = 1\{Q_Y(u|x) \le y\}. \]

Therefore,

\[ \frac{\int_0^1 1\{Q_Y(u|x) + t h_t(u|x) \le y\}\,du - \int_0^1 1\{Q_Y(u|x) \le y\}\,du}{t} \le \int_{B_\epsilon(F_Y(y|x))} \frac{1\{Q_Y(u|x) + t(h(F_Y(y|x)|x) - \delta) \le y\} - 1\{Q_Y(u|x) \le y\}}{t}\,du, \tag{A.3} \]

which by the change of variable $y' = Q_Y(u|x)$ is equal to

\[ \frac{1}{t} \int_{J \cap [y,\; y - t(h(F_Y(y|x)|x) - \delta)]} f_Y(y'|x)\,dy', \]

where $J$ is the image of $B_\epsilon(F_Y(y|x))$ under $u \mapsto Q_Y(\cdot|x)$. The change of variable is possible because $Q_Y(\cdot|x)$ is one-to-one between $B_\epsilon(F_Y(y|x))$ and $J$.

Fixing $\epsilon > 0$, for $t \searrow 0$, we have that $J \cap [y, y - t(h(F_Y(y|x)|x) - \delta)] = [y, y - t(h(F_Y(y|x)|x) - \delta)]$, and $f_Y(y'|x) \to f_Y(y|x)$ as $F_Y(y'|x) \to F_Y(y|x)$. Therefore, the right hand term in (A.3) is no greater than

\[ -f_Y(y|x)\,(h(F_Y(y|x)|x) - \delta) + o(1). \]

Similarly $-f_Y(y|x)\,(h(F_Y(y|x)|x) + \delta) + o(1)$ bounds (A.3) from below. Since $\delta > 0$ can be made arbitrarily small, the result follows.

To show that the result holds uniformly in $(y,x) \in K$, a compact subset of $\mathcal{YX}$, we use Lemma 3. Take a sequence $(y_t, x_t)$ in $K$ that converges to $(y,x) \in K$; then the preceding argument applies to this sequence, since the function $(y,x) \mapsto -f_Y(y|x)\,h(F_Y(y|x)|x)$ is uniformly continuous on $K$. This result follows by the assumed continuity of $h(u|x)$ in both arguments, continuity of $F_Y(y|x)$, and the assumed uniform continuity of $f_Y(y|x)$ in both arguments. □
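The limit in (A.1)-(A.2) can be checked numerically in a simple case. The sketch below is illustrative only: the standard exponential outcome without covariates, the direction $h(u) = u$, the evaluation point, and the step size are all assumptions chosen here for convenience, not objects from the paper.

```python
import numpy as np

# Illustrative choices (not from the paper): a standard exponential outcome
# with no covariates, and the continuous direction h(u) = u.
def Q(u):
    # Quantile function Q_Y(u) = -log(1 - u).
    return -np.log1p(-u)

def F(y):
    # Distribution function F_Y(y).
    return 1.0 - np.exp(-y)

def f(y):
    # Density f_Y(y).
    return np.exp(-y)

def h(u):
    return u

def perturbed_cdf(y, t, n_grid=2_000_000):
    # Midpoint-rule approximation of int_0^1 1{Q(u) + t h(u) <= y} du.
    u = (np.arange(n_grid) + 0.5) / n_grid
    return np.mean(Q(u) + t * h(u) <= y)

y, t = 1.0, 1e-3
difference_quotient = (perturbed_cdf(y, t) - F(y)) / t   # left side of (A.1)
analytic_derivative = -f(y) * h(F(y))                     # right side of (A.2)
print(difference_quotient, analytic_derivative)
```

The two printed numbers should agree up to the discretization error of the grid and a term of order $t$.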

A.3. Proof of Lemma 1. This lemma follows by the functional delta method (e.g., van der Vaart, 1998) from the Hadamard differentiability of $F_Y(y|x)$ with respect to $Q_Y(u|x)$ shown in Lemma 4. Instead of restating what this method is, it takes less space to simply adapt the proof to the current context.

Consider the map $g_n(y,x|h) = \sqrt{n}\,(F_Y(y|x,h/\sqrt{n}) - F_Y(y|x))$. The sequence of maps satisfies $g_{n'}(y,x|h_{n'}) \to D_h(y|x)$ in $\ell^\infty(K)$ for every subsequence $h_{n'} \to h$ in $\ell^\infty(\mathcal{UX})$, where $h$ is continuous. It follows by the Extended Continuous Mapping Theorem that, in $\ell^\infty(K)$, $g_n(y,x|\sqrt{n}(\hat Q_Y(u|x) - Q_Y(u|x))) \Rightarrow D_V(y|x)$ as a stochastic process indexed by $(y,x)$, since $\sqrt{n}(\hat Q_Y(u|x) - Q_Y(u|x)) \Rightarrow V(u,x)$ in $\ell^\infty(\mathcal{UX})$. □

A.4. Proof of Theorem 1. The joint uniform convergence result follows from Condition D.1 by the Extended Continuous Mapping Theorem, since the integral is a continuous operator. Gaussianity of the limit process follows from linearity of the integral. The derivation of the mean and covariance functions of the limit processes is standard and therefore we omit it. □

A.5. Proof of Theorem 2. The joint uniform convergence result and Gaussianity of the limit process follow from Theorem 1 by the Functional Delta Method, since the quantile operator is Hadamard differentiable (see, e.g., Fernholz, 1983, and Lemma 4). The derivation of the mean and covariance functions of the limit processes is standard and therefore we omit it. □

A.6. Proof of Corollary 1. This result follows directly from Theorem 2 by the Extended Continuous Mapping Theorem. The derivation of the mean and covariance function of the limit process is standard and therefore we omit it. □

A.7. Proof of Corollary 2. This result follows directly from Theorem 1 by the Extended Continuous Mapping Theorem. The derivation of the mean and covariance function of the limit process is standard and therefore we omit it. □

A.8. Proof of Corollary 3. This result follows directly from Theorem 1 by the Functional Delta Method. The derivation of the mean and covariance function of the limit process is standard and therefore we omit it. □

A.9. Proof of Lemma 2. The uniform convergence result for the conditional quantile process $\sqrt{n}\,(\hat Q_\Delta(u|x) - Q_\Delta(u|x))$ follows from Conditions Q.1 and R.P. by the Extended Continuous Mapping Theorem. Uniform convergence of the conditional distribution process $\sqrt{n}\,(\hat F_\Delta(\delta|x) - F_\Delta(\delta|x))$ follows from the convergence of the quantile process by the Functional Delta Method. The Hadamard differentiability of $F_\Delta(\delta|x)$ with respect to $Q_\Delta(u|x)$ can be established using the same argument as in the proof of Lemma 4. The expression for $f_\Delta(\delta|x)$ follows from $Q_\Delta(u|x) = Q_Y(u|g(x)) - Q_Y(u|x)$, $Q_Y'(u|x) = 1/f_Y(Q_Y(u|x)|x)$, and the Inverse Function Theorem. The derivation of the mean and covariance functions of the limit processes is standard and therefore we omit it. □

A.10. Proof of Theorem 3. The uniform convergence result for the distribution function follows from the convergence of the conditional process in Lemma 2 by the Extended Continuous Mapping Theorem, since the integral is a continuous operator. Gaussianity of the limit process follows from linearity of the integral. The uniform convergence result for the quantile function follows from the convergence of the distribution function by the Functional Delta Method, since the quantile operator is Hadamard differentiable (see, e.g., Fernholz, 1983, and Lemma 4). The derivation of the mean and covariance functions of the limit processes is standard and therefore we omit it. □

A.11. Proof of Corollary 4. This result follows from the Functional Delta Method for the Bootstrap (see, e.g., van der Vaart, 1998). □

Appendix B. Linear quantile regression model: pointwise inference

In order to make pointwise inference on the marginal quantile functions, the components of the expression (3.30) need to be estimated. The difficulty here is to find estimators for $F_Y(Q_Y(u)|x)$, $J(u)$, and $f_Y(Q_Y(u)|x)$ that are uniformly consistent in $u$. Chernozhukov, Fernandez-Val, and Galichon (2006) establish the uniform consistency of the estimator of $F_Y(y|x)$

\[ \hat F_Y(y|x) = \int_0^1 1\{x'\hat\beta(u) \le y\}\,d\lambda(u), \tag{B.1} \]

where $\lambda$ is a uniform measure over a fine enough grid over $(0,1)$. Then, $\hat F_Y^j(y) = \int_{\mathcal{X}} \hat F_Y(y|x)\,dF_X^j(x)$ is a uniformly consistent estimator of $F_Y^j(y)$ by the Extended Continuous Mapping Theorem, and $\hat Q_Y^j(u) = \inf\{y : \hat F_Y^j(y) \ge u\}$ is a uniformly consistent estimator for $Q_Y^j(u)$ by the Functional Delta Method. For the Jacobian term, $J(u)$, Angrist, Chernozhukov, and Fernandez-Val (2006) establish the uniform convergence of Powell's kernel estimator. Finally, the estimator

\[ \hat f_Y(y|x) = \frac{1}{x'\hat J(\hat F_Y(y|x))^{-1} \bar X_n}, \tag{B.2} \]

where $\bar X_n$ is the sample mean of the observed covariates and $\hat J(\cdot)$ is Powell's kernel estimator of the Jacobian, is uniformly consistent for the conditional density by the Extended Continuous Mapping Theorem.
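A grid version of the estimator in (B.1), together with the left-inverse used to recover quantiles, might look as follows. This is a sketch under stated assumptions: the coefficient process below is a hypothetical closed-form stand-in for estimated quantile regression coefficients, and the function names and grids are chosen here for illustration.

```python
import numpy as np

def conditional_cdf(y, x, beta_on_grid):
    """Grid version of (B.1): average of 1{x'beta(u) <= y} over the u-grid."""
    fitted_quantiles = beta_on_grid @ x          # x'beta(u) at each grid point u
    return np.mean(fitted_quantiles <= y)

def left_inverse(u, y_grid, cdf_values):
    """Q(u) = inf{y : F(y) >= u}, evaluated on a finite grid of y values."""
    idx = np.searchsorted(cdf_values, u, side="left")   # first index with F >= u
    return y_grid[min(idx, len(y_grid) - 1)]

# Hypothetical coefficient process standing in for quantile regression
# estimates: Q_Y(u|x) = b0(u) + b1(u) * x1 with b0(u) = -log(1 - u) and
# b1(u) = 1 + 0.5 * b0(u).
u_grid = (np.arange(999) + 0.5) / 999
b0 = -np.log1p(-u_grid)
beta_on_grid = np.column_stack([b0, 1.0 + 0.5 * b0])

x = np.array([1.0, 2.0])                         # intercept and x1 = 2
y_grid = np.linspace(2.0, 12.0, 2001)
cdf_values = np.array([conditional_cdf(y, x, beta_on_grid) for y in y_grid])
median = left_inverse(0.5, y_grid, cdf_values)
cdf_at_4 = conditional_cdf(4.0, x, beta_on_grid)
```

In this toy design the conditional law at `x` is 2 plus twice a standard exponential, so the computed values can be compared with the closed forms $1 - e^{-(y-2)/2}$ and $2 + 2\log 2$.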

Appendix C. Limit Distribution Theory: Estimated Covariate Distributions

We start by restating Condition D.1 to incorporate the assumptions about the estimators of the covariate distributions.

D.1' Let $Z(y,x) := \sqrt{n}\,(\hat F_Y(y|x) - F_Y(y|x))$, and $B_X^j(x) := \sqrt{n}\,(\hat F_X^j(x) - F_X^j(x))$ for $j \in \{o,c\}$. These processes converge jointly in law to a multivariate continuous Gaussian process having independent increments:


in the space $\ell^\infty(\mathcal{Y} \times \mathcal{X} \times \mathcal{X} \times \mathcal{X})$; where the random function $(y,x) \mapsto Z(y,x)$ has zero mean and uniformly bounded covariance function $\Sigma_Z(y,x,\tilde y,\tilde x) := E[Z(y,x)Z(\tilde y,\tilde x)]$, the random functions $x \mapsto B_X^j(x)$, for $j \in \{o,c\}$, have zero means and uniformly
highly dependent Brownian bridges. The second components of the covariance functions are

\[ \int_{\mathcal{X}} [F_Y(y|x) - F_Y^o(y)]\,[F_Y(\tilde y|x) - F_Y^o(\tilde y)]\,dF_X^o(x), \tag{C.6} \]

for the observed outcome,

\[ \int_{\mathcal{X}} [F_Y(y|g(x)) - F_Y^c(y)]\,[F_Y(\tilde y|g(x)) - F_Y^c(\tilde y)]\,dF_X^o(x), \tag{C.7} \]

for the counterfactual outcome, and the second component of the cross covariance function is

\[ \int_{\mathcal{X}} [F_Y(y|x) - F_Y^o(y)]\,[F_Y(\tilde y|g(x)) - F_Y^c(\tilde y)]\,dF_X^o(x). \tag{C.8} \]

Proof of Theorem 4: The stochastic process for the distribution function $\sqrt{n}\,(\hat F_Y^j(y) - F_Y^j(y))$ can be decomposed in three components:

\[ \int_{\mathcal{X}} Z(y,x)\,dF_X^j(x) + \int_{\mathcal{X}} F_Y(y|x)\,dB_X^j(x) + \frac{1}{\sqrt{n}} \int_{\mathcal{X}} Z(y,x)\,dB_X^j(x). \tag{C.9} \]

The first component converges uniformly to $Z^j(y)$ by Theorem 1. The uniform convergence of the second component to a Gaussian process follows from the convergence of empirical stochastic integrals by condition D.1' (see, e.g., DeJong and Davidson, 2000), and standard Itô results for stochastic integrals of Gaussian processes, since $F_Y(y|x)$ is uniformly continuous in both arguments. The third term is of order $o_p(1)$ uniformly in $y$ since the stochastic integral $\int_{\mathcal{X}} Z(y,x)\,dB_X^j(x)$ is bounded in probability uniformly in $y$ by condition D.1' (see, e.g., DeJong and Davidson, 2000). This result follows because the random function $Z(y,x)$ is square integrable uniformly in $y$ (see, e.g., Proposition 7.41 in White (2001)), since $\int_{\mathcal{X}} \Sigma_Z(y,x,y,x)\,dx$ is bounded uniformly in $y$ by condition D.1' and compactness of $\mathcal{X}$. □
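The three-term decomposition (C.9) can be verified by direct expansion. As a sketch, write each estimator as the truth plus a scaled deviation, using the processes of Condition D.1'. The intermediate displays are added here for illustration and suppress the index $j$:

```latex
\begin{align*}
\hat F_Y(y|x) &= F_Y(y|x) + n^{-1/2} Z(y,x), \qquad
\hat F_X(x) = F_X(x) + n^{-1/2} B_X(x), \\
\sqrt{n}\,\bigl(\hat F_Y(y) - F_Y(y)\bigr)
 &= \sqrt{n}\left( \int_{\mathcal{X}} \hat F_Y(y|x)\, d\hat F_X(x)
      - \int_{\mathcal{X}} F_Y(y|x)\, dF_X(x) \right) \\
 &= \int_{\mathcal{X}} Z(y,x)\, dF_X(x)
  + \int_{\mathcal{X}} F_Y(y|x)\, dB_X(x)
  + \frac{1}{\sqrt{n}} \int_{\mathcal{X}} Z(y,x)\, dB_X(x).
\end{align*}
```

The first two terms are linear in the deviations, while the last is their product scaled by $n^{-1/2}$, which is why it vanishes in the limit once the stochastic integral is shown to be bounded in probability.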

Theorem 5 (Limit Distribution for Marginal Quantiles). Under the conditions M.1, M.2, C.1, and D.1' the estimators of the marginal quantile functions converge in law to the following continuous linear Gaussian functional:

\[ \sqrt{n}\,(\hat Q_Y^j(u) - Q_Y^j(u)) \Rightarrow -f_Y^j(Q_Y^j(u))^{-1}\,\tilde Z^j(Q_Y^j(u)) := \tilde V^j(u), \tag{C.10} \]

in the space $\ell^\infty((0,1))$, where $f_Y^j(y) = \int_{\mathcal{X}} f_Y(y|x)\,dF_X^j(x)$ and the random function $u \mapsto \tilde V^j(u)$ has zero mean and covariance function

\[ \tilde\Sigma_V^j(u,\tilde u) := f_Y^j(Q_Y^j(u))^{-1}\,f_Y^j(Q_Y^j(\tilde u))^{-1}\,\tilde\Sigma_Z^j(Q_Y^j(u), Q_Y^j(\tilde u)). \tag{C.11} \]

The convergence holds jointly for all estimators indexed by the status $j \in \{o,c\}$, with cross-covariance function

\[ \tilde\Sigma_V^{oc}(u,\tilde u) := f_Y^o(Q_Y^o(u))^{-1}\,f_Y^c(Q_Y^c(\tilde u))^{-1}\,\tilde\Sigma_Z^{oc}(Q_Y^o(u), Q_Y^c(\tilde u)). \tag{C.12} \]

Proof of Theorem 5: The joint uniform convergence result and Gaussianity of the limit process follow from Theorem 4 by the Functional Delta Method, since the quantile operator is Hadamard differentiable (see, e.g., Fernholz, 1983, and Lemma 4). The derivation of the mean and covariance functions of the limit processes is standard and therefore we omit it. □

Corollary 5 (Limit Distribution for Quantile Treatment Effects). Under the conditions M.1, M.2, C.1, and D.1' the estimators of the quantile treatment effects converge in law to the following linear functional of continuous Gaussian processes:

\[ \sqrt{n}\,(\widehat{QTE}_Y(u) - QTE_Y(u)) \Rightarrow \tilde V^c(u) - \tilde V^o(u) := \tilde W(u), \tag{C.13} \]

in the space $\ell^\infty((0,1))$, where the random function $u \mapsto \tilde W(u)$ has zero mean and covariance function

\[ \tilde\Sigma_W(u,\tilde u) := \tilde\Sigma_V^o(u,\tilde u) + \tilde\Sigma_V^c(u,\tilde u) - \tilde\Sigma_V^{oc}(u,\tilde u) - \tilde\Sigma_V^{oc}(\tilde u,u). \tag{C.14} \]

Proof of Corollary 5: This result follows directly from Theorem 5 by the Extended Continuous Mapping Theorem. The derivation of the mean and covariance function of the limit process is standard and therefore we omit it. □

Corollary 6 (Limit Distribution for Distribution Effects). Under the conditions M.1, M.2, and D.1' the estimator of the distribution effects converges in law to the following linear functional of continuous Gaussian processes:

\[ \sqrt{n}\,(\widehat{DE}_Y(y) - DE_Y(y)) \Rightarrow \tilde Z^c(y) - \tilde Z^o(y) := \tilde S(y), \tag{C.15} \]

in the space $\ell^\infty(\mathcal{Y})$, where the random function $y \mapsto \tilde S(y)$ has zero mean and covariance function

\[ \tilde\Sigma_S(y,\tilde y) := \tilde\Sigma_Z^o(y,\tilde y) + \tilde\Sigma_Z^c(y,\tilde y) - \tilde\Sigma_Z^{oc}(y,\tilde y) - \tilde\Sigma_Z^{oc}(\tilde y,y). \tag{C.16} \]

Proof of Corollary 6: This result follows directly from Theorem 4 by the Extended Continuous Mapping Theorem. The derivation of the mean and covariance function of the limit process is standard and therefore we omit it. □

Corollary 7 (Limit Distribution for Differentiable Functionals). Let $H_Y(y) = \phi(F_Y^o, F_Y^c, y)$ be a Hadamard differentiable functional in the first two arguments, with derivatives $\phi_o'$ and $\phi_c'$ with respect to the first and second argument. Under the conditions M.1, M.2, and D.1' the estimator of the functional $H_Y(y)$ defined in (2.17) converges in law to the following linear functional of continuous Gaussian processes:

\[ \sqrt{n}\,(\hat H_Y(y) - H_Y(y)) \Rightarrow \phi_o'(F_Y^o, F_Y^c, y)\,\tilde Z^o(y) + \phi_c'(F_Y^o, F_Y^c, y)\,\tilde Z^c(y) := \tilde T(y), \tag{C.17} \]

in the space $\ell^\infty(\mathcal{Y})$, where the random function $y \mapsto \tilde T(y)$ has zero mean and covariance function

\[ \tilde\Sigma_T(y,\tilde y) := \phi_o'\tilde\phi_o'\,\tilde\Sigma_Z^o(y,\tilde y) + \phi_c'\tilde\phi_c'\,\tilde\Sigma_Z^c(y,\tilde y) + \phi_o'\tilde\phi_c'\,\tilde\Sigma_Z^{oc}(y,\tilde y) + \phi_c'\tilde\phi_o'\,\tilde\Sigma_Z^{oc}(\tilde y,y), \tag{C.18} \]

where $\phi_j' := \phi_j'(F_Y^o, F_Y^c, y)$ and $\tilde\phi_j' := \phi_j'(F_Y^o, F_Y^c, \tilde y)$, for $j \in \{o,c\}$.

Proof of Corollary 7: This result follows from the Functional Delta Method (see, e.g., van der Vaart, 1998). □

Corollary 8 (Validity of bootstrap for uniform inference). If the limit process in (C.1) satisfies the conditions to guarantee the validity of bootstrap, then the limit processes (C.2), (C.10), (C.13), (C.15), and (C.17) also satisfy these conditions.

Proof of Corollary 8: This result follows from the Functional Delta Method for the Bootstrap (see, e.g., van der Vaart, 1998). □
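To make concrete how bootstrap draws translate into a simultaneous band, the following is a minimal sketch. It assumes a sup-t construction (a single critical value applied to pointwise bootstrap standard errors) and an i.i.d. resampling scheme for a plain empirical quantile function; the paper's own algorithm is not reproduced here, and the function name `uniform_band` is hypothetical.

```python
import numpy as np

def uniform_band(estimate, bootstrap_draws, level=0.90):
    """Sup-t band from bootstrap draws of a function evaluated on a grid.

    estimate: shape (n_grid,); bootstrap_draws: shape (n_boot, n_grid).
    """
    deviations = bootstrap_draws - estimate
    scale = deviations.std(axis=0, ddof=1)            # pointwise bootstrap s.e.
    scale = np.where(scale > 0, scale, np.inf)        # ignore degenerate points
    sup_t = np.max(np.abs(deviations) / scale, axis=1)
    critical = np.quantile(sup_t, level)              # one value for the whole grid
    half_width = critical * np.where(np.isfinite(scale), scale, 0.0)
    return estimate - half_width, estimate + half_width

# Toy use: empirical quantile function of a simulated sample.
rng = np.random.default_rng(1)
sample = rng.exponential(size=400)
u_grid = np.linspace(0.1, 0.9, 17)
estimate = np.quantile(sample, u_grid)
draws = np.array([
    np.quantile(rng.choice(sample, size=sample.size, replace=True), u_grid)
    for _ in range(500)
])
lower, upper = uniform_band(estimate, draws)
inside = np.all((draws >= lower) & (draws <= upper), axis=1).mean()
```

By construction, roughly the chosen fraction of bootstrap paths lies entirely inside the band, which is the finite-sample counterpart of covering the whole function at once rather than point by point.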

Appendix D. Verification of regularity conditions for common conditional model estimators

Example 1. Location regression. Consider the linear location regression model $Y = X'\beta + V$, where the disturbance $V$ is independent of $X$, with mean zero, variance $\sigma_V^2$ and quantile function $Q_V(u)$. In this case, the location parameter $\beta$ can be estimated by OLS and the quantiles of $V$ can be estimated by the sample quantiles of the OLS residuals. The estimator of the conditional cdf of $Y$ is therefore

\[ \hat F_Y(y|x) = \hat F_V(y - x'\hat\beta). \tag{D.1} \]

Under suitable regularity conditions and i.i.d. sampling, by Theorem 2 in Durbin (1973) we have

\[ \sqrt{n}\,\bigl(\hat F_V(y - x'\hat\beta) - F_V(y - x'\beta)\bigr) \Rightarrow Z(y,x), \tag{D.2} \]

in the space $\ell^\infty(\mathcal{Y} \times \mathcal{X})$, where $Z(y,x)$ is a Gaussian process with covariance function

\[ \Sigma_Z(y,x,\tilde y,\tilde x) = [\min(F_Y(y|x), F_Y(\tilde y|\tilde x)) - F_Y(y|x)F_Y(\tilde y|\tilde x)] - f_Y(y|x)\,f_Y(\tilde y|\tilde x)\,\sigma_V^2\,x'E[XX']^{-1}\tilde x, \tag{D.3} \]

since $F_Y(y|x) = F_V(y - x'\beta)$ and $f_Y(y|x) = f_V(y - x'\beta)$. This covariance function is uniformly bounded under our assumptions if $E[XX']$ is nonsingular.
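The estimator (D.1) is simple to compute: fit OLS, then evaluate the empirical distribution of the residuals at the shifted point. The sketch below uses simulated data; the design, sample size, and function names are assumptions made for illustration.

```python
import numpy as np

def fit_location_model(X, y):
    """OLS coefficients and sorted residuals for Y = X'beta + V (X has a constant)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta_hat
    return beta_hat, np.sort(residuals)

def conditional_cdf(y0, x0, beta_hat, sorted_residuals):
    """Estimator (D.1): empirical cdf of the residuals at y0 - x0'beta_hat."""
    count = np.searchsorted(sorted_residuals, y0 - x0 @ beta_hat, side="right")
    return count / sorted_residuals.size

# Simulated location model with standard normal disturbances.
rng = np.random.default_rng(2)
n = 2000
X = np.column_stack([np.ones(n), rng.uniform(0.0, 2.0, size=n)])
y = X @ np.array([1.0, 0.5]) + rng.standard_normal(n)

beta_hat, sorted_residuals = fit_location_model(X, y)
x0 = np.array([1.0, 1.0])
p = conditional_cdf(1.5, x0, beta_hat, sorted_residuals)
```

In this design the conditional median at `x0` equals 1.5, so `p` should be close to one half.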

Example  3.  Duration  Models.  Dabrowska  (2005)  gives  regularity  conditions  for 
weak  convergence  of  quantile  regression  estimators  of  transformation  models  including 
proportional  hazard  models. 

Example 4. Distribution Regression. Consider the conditional model for the distribution function $F_Y(y|x) = \Lambda(x'\beta(y))$, where $\Lambda$ is a known link function (e.g., logit or probit). The function $\beta(y)$ can be estimated by running logit or probit regressions of indicator variables $1\{Y \le y\}$ on the regressor vector $X$ (see, e.g., Foresi and Peracchi, 1995). Under i.i.d. sampling and other regularity conditions, the estimator of $\beta(y)$ satisfies, in the space $\ell^\infty(\mathcal{Y})$,

\[ \sqrt{n}\,(\hat\beta(y) - \beta(y)) \Rightarrow -J(y)^{-1}B(y), \tag{D.4} \]

where $J(y) = E[\Lambda'(X'\beta(y))^2 XX' / \{\Lambda(X'\beta(y))[1 - \Lambda(X'\beta(y))]\}]$, $\Lambda'(\cdot)$ is the derivative of $\Lambda(\cdot)$, and $B(y)$ is a zero mean Gaussian process with covariance function

\[ \Sigma_B(y,\tilde y) = E\left[ \frac{\Lambda'(X'\beta(y))\,\Lambda'(X'\beta(\tilde y))}{\Lambda(X'\beta(y))\,(1 - \Lambda(X'\beta(\tilde y)))}\,XX' \right], \tag{D.5} \]

for $y \ge \tilde y$. Hence,

\[ \sqrt{n}\,(\hat F_Y(y|x) - F_Y(y|x)) \Rightarrow Z(y,x) = -\Lambda'(x'\beta(y))\,x'J(y)^{-1}B(y), \tag{D.6} \]

in the space $\ell^\infty(\mathcal{Y} \times \mathcal{X})$, where $Z(y,x)$ is a Gaussian process with zero mean and covariance function:

\[ \Sigma_Z(y,x,\tilde y,\tilde x) = \Lambda'(x'\beta(y))\,\Lambda'(\tilde x'\beta(\tilde y))\,x'J(y)^{-1}\Sigma_B(y,\tilde y)J(\tilde y)^{-1}\tilde x. \tag{D.7} \]

This covariance function is uniformly bounded under our assumptions if $J(y)$ is nonsingular uniformly in $y$.

For example, in the case of the logit $\Lambda'(u) = \Lambda(u)(1 - \Lambda(u))$. The asymptotic variance of the estimator of the marginal distribution based on the conditional model is

\[ \Sigma_Z(y,y) = E[\Lambda'(X'\beta(y))X]'\,E[\Lambda'(X'\beta(y))XX']^{-1}\,E[\Lambda'(X'\beta(y))X] = E[\Lambda(X'\beta(y))(1 - \Lambda(X'\beta(y)))]. \tag{D.8} \]

If the covariate distribution is estimated then the asymptotic variance becomes

\[ E[\Lambda(X'\beta(y))(1 - \Lambda(X'\beta(y)))] + E\left[ (\Lambda(X'\beta(y)) - E[\Lambda(X'\beta(y))])^2 \right], \tag{D.9} \]

which is the same as the asymptotic variance of the empirical marginal distribution function of $Y$. The logit estimates indeed not only have the same asymptotic variance but are also numerically identical to the empirical distribution estimates since

\[ \hat F_Y(y) = \frac{1}{n}\sum_{i=1}^n \Lambda(X_i'\hat\beta(y)) = \frac{1}{n}\sum_{i=1}^n 1\{Y_i \le y\}, \tag{D.10} \]

where the last equality follows from the first order conditions of the logit if the regressor $X$ includes a constant term. Note that this result holds regardless of whether the conditional model is correctly specified or not.
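The numerical identity between the averaged logit fit and the empirical distribution function can be confirmed on simulated data. The sketch below is an illustration under stated assumptions: the data generating process is deliberately not a logit model, and the hand-written Newton iteration (`fit_logit`) is a stand-in for any logit maximum likelihood routine.

```python
import numpy as np

def fit_logit(X, d, n_iter=50):
    """Logit maximum likelihood by Newton's method; d is a 0/1 outcome."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (d - p)                           # first order conditions
        hessian = (X * (p * (1.0 - p))[:, None]).T @ X  # minus the Hessian
        step = np.linalg.solve(hessian, score)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-12:
            break
    return beta

rng = np.random.default_rng(3)
n = 500
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
# Deliberately non-logit data: the identity does not need a correct model.
Y = 1.0 + X[:, 1] + 0.5 * X[:, 1] ** 2 + rng.standard_normal(n)

gaps = []
for y0 in (0.5, 1.5, 3.0):
    d = (Y <= y0).astype(float)                # indicator 1{Y_i <= y0}
    beta_hat = fit_logit(X, d)
    fitted = 1.0 / (1.0 + np.exp(-X @ beta_hat))
    gaps.append(abs(fitted.mean() - d.mean())) # two sides of the identity
print(max(gaps))
```

The gap is zero up to the convergence tolerance of the optimizer because the score equation for the intercept sets the sum of fitted probabilities equal to the sum of indicators.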

Figure 1. Observed and counterfactual empirical distribution for household income for the Engel food expenditure data. The counterfactual distribution is constructed as a neutral reallocation of the observed income from above to below the mean that yields a 25% reduction in the standard deviation, that is $F_X^c(x) = F_X^o(E[X^o] + (x - E[X^o])/0.75)$.

Figure 2. Simultaneous 90% confidence bands for quantile functions using the Engel food expenditure data: Location-shift model. Counterfactual exercise consists of a mean-preserving spread of income that reduces standard deviation by 25%. Right panel plots uniform bands for quantile treatment effects. Left panel shows uniform bands for the observed and counterfactual quantile functions of food expenditure.

Figure 3. Simultaneous 90% confidence bands for quantile functions using the Engel food expenditure data: Quantile regression model. Counterfactual exercise consists of a mean-preserving spread of income that reduces standard deviation by 25%. Left panel shows uniform bands for the observed and counterfactual quantile functions of food expenditure. Right panel plots uniform bands for quantile treatment effects.

Figure 4. Simultaneous 90% confidence bands for quantile functions using the Engel food expenditure data: Distribution regression model. Counterfactual exercise consists of a mean-preserving spread of income that reduces standard deviation by 25%. Left panel shows uniform bands for the observed and counterfactual quantile functions of food expenditure. Right panel plots uniform bands for quantile treatment effects.